Abstract

This paper presents a general notion of Mahalanobis distance for functional data
that extends the classical multivariate concept to situations where the observed data
are points belonging to curves generated by a stochastic process. More precisely, a
new semi-distance for functional observations that generalizes the usual Mahalanobis
distance for multivariate datasets is introduced. For that, the development uses a
regularized square root inverse operator in Hilbert spaces. Some of the main characteristics
of the functional Mahalanobis semi-distance are shown. Afterwards, new versions of
several well known functional classification procedures are developed using the
Mahalanobis distance for functional data as a measure of proximity between functional
observations. The performance of several well known functional classification procedures
is compared with that of the same methods used in conjunction with the Mahalanobis
distance for functional data, with positive results, through a Monte Carlo study and
the analysis of two real data examples.

Keywords: Classification methods; Functional data analysis; Functional Maha- 
1 Introduction

At present, there are many situations in different fields of applied sciences, such
as chemometrics, economics, image analysis, medicine, meteorology and speech recognition,
among others, where it can be assumed that the observed data are points belonging to
functions defined over a given set. Functional data analysis (FDA) deals with this kind of
observations. In practice, the values of the functions are available only at a finite number of
points and, as a general rule, functional samples may contain fewer functions than evaluation
points. For this and other reasons, the majority of known multivariate tools cannot be
used for statistical analysis of this type of data which, by its nature, requires a different
type of treatment. There are several methodologies for FDA, the most popular being the one
based on the use of basis functions such as Fourier and splines; see Ramsay and Silverman
(2005). Alternatively, other procedures, such as the nonparametric approach proposed by
Ferraty and Vieu (2006), do not require knowledge of the explicit form of the functions.
The ideas developed in this paper can be adapted to any of these situations. However, for
ease of exposition, the focus of this paper is on the basis functions approach.

Even if multivariate methods are not usually well suited for functional datasets,
many multivariate techniques have inspired advances in FDA. The introduction of the notion
of distance for functional data is one example. Usually, it is assumed that the set
of functions has been generated by a functional random variable defined on a Hilbert space
endowed with a certain distance. However, in the recent literature on functional data, there
is little reference to the role played by distances between functional data, with the book
of Ferraty and Vieu (2006) being an exception. These authors have proposed semi-metrics well
adapted for sample functions, including semi-metrics based on functional principal
components (FPC), partial least-squares (PLS) type semi-metrics and semi-metrics based on
derivatives. However, common distances frequently used in multivariate data analysis, such
as the Mahalanobis distance proposed by Mahalanobis (1936), have not been extended to the
functional framework. The first contribution of this paper is to fill this gap by presenting the
functional Mahalanobis semi-distance, which extends the multivariate Mahalanobis distance to
the functional setting.

The use of distances in multivariate analysis is important in many different problems
including classification, clustering, hypothesis testing and outlier detection, among others.
In particular, several of the best known methods for classification analysis are
distance-based. Under a functional perspective, the aim of classification procedures is to decide
whether a function $\chi_0$ generated from a functional random variable $\chi$ belongs to one of
$G$ classes using the information provided by $G$ independent training samples
$\chi_{g1}, \ldots, \chi_{gn_g}$, where $g = 1, \ldots, G$. Here $\chi_{gi}$, for $i = 1, \ldots, n_g$,
are independent replications of the functional random variable $\chi$, measured on $n_g$ randomly
chosen individuals from class $g$. Using this information, a functional classification method
provides a classification rule that can be used to classify $\chi_0$. Nowadays, there is a wide
variety of methods developed to solve this problem.
For instance, several papers have proposed to classify functional observations by means of
the functional principal component scores. Hall et al. (2001) proposed a
method that consists in obtaining the functional principal component scores of the training
samples, then estimating nonparametrically the probability densities of the sets of functional
principal component scores and finally estimating the posterior probability that $\chi_0$ is of a
given class using the Bayes classification rule. This approach was considered by Glendinning and
Herbert (2003) for shape classification. Under a similar perspective, Leng and Müller (2006)
proposed a method of classifying collections of temporal gene expression curves by means
of functional logistic regression on the functional principal component scores of the training
samples. Also, Song et al. (2008) compared several multivariate classification methods
on the basis expansion coefficients of the training samples for classifying time-course
gene expression data. On the other hand, the popular nearest neighbor classification rule
has also been extended to functional data. For instance, Biau et al. (2005) proposed to
filter the training samples in the Fourier basis and to apply the kNN method to the first
Fourier coefficients of the expansion, while Baíllo et al. (2011) derived several consistency
results of the kNN procedure for a particular type of Gaussian processes. Additionally,
the centroid method, which assigns the function to the group with the closest mean, has been
adapted to the functional framework by Delaigle and Hall (2012). Alternatively, several
papers have extended Fisher's discriminant analysis to the functional framework. The
idea of these methods is to project the observations into a finite dimensional space where the
classes are separated as much as possible. The transformed functions are called discriminant
functions. Then, the new function $\chi_0$ is also projected into this space and classified using
the Bayes classification rule. In particular, James and Hastie (2001) used a natural cubic
spline basis plus random error to model the observations from each individual. The spline is
parameterized using a basis function multiplied by a coefficient vector, which is modeled using
a Gaussian distribution. The observed functions can then be pooled to estimate the mean and
covariance for each class by means of an Expectation-Maximization (EM) algorithm, and these
estimates are used to obtain the discriminant functions. Alternatively, Preda et al. (2007) used
functional PLS regression to obtain the discriminant functions while Shin (2008) considered an
approach based on reproducing kernel Hilbert spaces. Finally, Ferraty and Vieu (2003) have
proposed a method based on estimating nonparametrically the posterior probability that the new
function $\chi_0$ is of a given class; López-Pintado and Romo (2006), Cuevas et al. (2007) and
Sguera et al. (2012) have proposed classifiers based on the notion of data depth that are
well suited for datasets containing outliers; Rossi and Villa (2006) and Martín-Barragán
et al. (2013) have investigated the use of support vector machines (SVMs) for functional
data; Wang et al. (2007) have considered classification for functional data by Bayesian
modeling with wavelet basis functions; Epifanio (2008) has developed classifiers based on
shape descriptors; Araki et al. (2009) have considered functional logistic classification; and
Alonso et al. (2012) have proposed a weighted distance approach. Note that, when
a distance is required, these papers use the $L^1$, $L^2$ and $L^\infty$ distances, which are well
defined in Hilbert spaces. The second contribution of this paper is to show that several simple
classification procedures, including the kNN procedure, the centroid method and functional
Bayes classification rules, can be used in conjunction with the functional Mahalanobis
semi-distance as the criterion of proximity between functions to get very good classification rates
without the need for much more sophisticated classification methods. Several Monte Carlo
experiments suggest that methods based on the functional Mahalanobis semi-distance lead
to better classification rates than other alternatives.

The rest of this paper is organized as follows. Section 2 introduces the functional Maha- 
lanobis semi-distance and shows some of its main characteristics. Section 3 reviews several 
classification methods for functional data and provides new approaches to these methods 
based on the functional Mahalanobis semi-distance. Section 4 analyzes the empirical properties
of the procedures via several Monte Carlo experiments and illustrates the good behavior
of the classification methods in conjunction with the functional Mahalanobis semi-distance
through the analysis of two real data examples. Finally, some conclusions are drawn in
Section 5. 

2 The functional Mahalanobis semi-distance

2.1 Definitions and some characteristics 

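This section presents the functional Mahalanobis semi-distance, which generalizes the
Mahalanobis distance for multivariate random variables to the functional framework. Let
$\mathbf{x}$ be a multivariate continuous random variable defined in $\mathbb{R}^p$ with mean vector
$\mathbf{m}_x = E[\mathbf{x}]$ and positive definite covariance matrix
$\mathbf{C}_x = E[(\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)']$. The Mahalanobis distance
between the random variable $\mathbf{x}$ and its mean vector $\mathbf{m}_x$ is the Euclidean norm of the
random vector $\mathbf{C}_x^{-1/2}(\mathbf{x} - \mathbf{m}_x)$, which can be written (see Mahalanobis, 1936) as:

$$d_M(\mathbf{x}, \mathbf{m}_x) = \left\| \mathbf{C}_x^{-1/2}(\mathbf{x} - \mathbf{m}_x) \right\|_E
= \left\langle \mathbf{C}_x^{-1/2}(\mathbf{x} - \mathbf{m}_x), \mathbf{C}_x^{-1/2}(\mathbf{x} - \mathbf{m}_x) \right\rangle_E^{1/2}
= \left[ (\mathbf{x} - \mathbf{m}_x)' \mathbf{C}_x^{-1} (\mathbf{x} - \mathbf{m}_x) \right]^{1/2}, \tag{1}$$
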

where $\|\cdot\|_E$ and $\langle \cdot, \cdot \rangle_E$ denote the Euclidean norm and the usual inner
product in $\mathbb{R}^p$, respectively. The main characteristic of the multivariate Mahalanobis distance
is that it takes into account the correlation structure of the multivariate random variable
$\mathbf{x}$. Moreover, the multivariate Mahalanobis distance is scale invariant. For future
developments, it is important to note that the Mahalanobis distance can be written in terms of the
principal component scores of $\mathbf{x}$. For that, let $\mathbf{v}_1, \ldots, \mathbf{v}_p$ be the
eigenvectors of the covariance matrix $\mathbf{C}_x$ associated with the positive eigenvalues
$a_1 \geq \cdots \geq a_p > 0$, and let $\mathbf{V}$ be the $p \times p$ matrix whose
columns are the eigenvectors of $\mathbf{C}_x$, i.e., $\mathbf{V} = [\mathbf{v}_1 | \cdots | \mathbf{v}_p]$.
Then, the vector of principal component scores, given by $\mathbf{s} = \mathbf{V}'(\mathbf{x} - \mathbf{m}_x)$,
is a multivariate random variable with zero mean vector and diagonal covariance matrix. As a
consequence, $\mathbf{x}$ can be written in terms of the principal component scores in the following way:

$$\mathbf{x} = \mathbf{m}_x + \mathbf{V}\mathbf{s}. \tag{2}$$

On the other hand, the singular value decomposition of $\mathbf{C}_x$, i.e.,
$\mathbf{C}_x = \mathbf{V}\mathbf{A}\mathbf{V}'$, where $\mathbf{A}$ is a diagonal matrix with the ordered
eigenvalues $a_1, \ldots, a_p$ in the main diagonal, allows the inverse of $\mathbf{C}_x$ to be written
in terms of $\mathbf{V}$ and $\mathbf{A}$ as follows:

$$\mathbf{C}_x^{-1} = \mathbf{V}\mathbf{A}^{-1}\mathbf{V}'. \tag{3}$$

Now, (2) and (3) lead to the following expression of the Mahalanobis distance between the
random variable $\mathbf{x}$ and its mean vector $\mathbf{m}_x$ in terms of the principal component scores:

$$d_M(\mathbf{x}, \mathbf{m}_x) = \left( \mathbf{s}'\mathbf{V}'\mathbf{V}\mathbf{A}^{-1}\mathbf{V}'\mathbf{V}\mathbf{s} \right)^{1/2}
= \left( \mathbf{s}'\mathbf{A}^{-1}\mathbf{s} \right)^{1/2} = \left( \mathbf{z}'\mathbf{z} \right)^{1/2}, \tag{4}$$

where $\mathbf{z} = \mathbf{A}^{-1/2}\mathbf{s}$ is the random vector of standardized principal component
scores. In other words, the Mahalanobis distance between $\mathbf{x}$ and $\mathbf{m}_x$ can be written as
the Euclidean norm of the standardized principal component scores.
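
The equivalence between the inverse-covariance form and the score-based form of the multivariate Mahalanobis distance can be checked numerically. The following Python sketch assumes NumPy is available; the mean vector, covariance matrix and observation are made-up illustrative values, and the function names are hypothetical.

```python
import numpy as np

def mahalanobis_direct(x, mean, cov):
    """Mahalanobis distance computed from the inverse covariance matrix."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

def mahalanobis_from_scores(x, mean, cov):
    """Same distance as the Euclidean norm of standardized principal component scores."""
    eigvals, eigvecs = np.linalg.eigh(cov)   # columns of eigvecs are the eigenvectors
    s = eigvecs.T @ (x - mean)               # principal component scores s = V'(x - m)
    z = s / np.sqrt(eigvals)                 # standardized scores z = A^{-1/2} s
    return float(np.linalg.norm(z))

# Illustrative (made-up) mean vector and positive definite covariance matrix.
mean = np.array([1.0, -2.0, 0.5])
cov = np.array([[4.0, 1.2, 0.5],
                [1.2, 2.0, 0.3],
                [0.5, 0.3, 1.0]])
x = np.array([2.5, -1.0, 1.5])

d_direct = mahalanobis_direct(x, mean, cov)
d_scores = mahalanobis_from_scores(x, mean, cov)
print(d_direct, d_scores)   # the two values agree up to rounding error
```

Rescaling each coordinate (for instance, changing measurement units) leaves both values unchanged, which is the scale invariance mentioned above.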

As mentioned before, the main goal of this section is to generalize the multivariate
Mahalanobis distance to the functional setting. However, the proposal does not lead to a
functional distance but to a functional semi-distance. The reasons for this will be clear once the
functional Mahalanobis semi-distance is presented. For that, let $\chi$ be a functional random
variable defined in the infinite dimensional space $L^2(T)$, i.e., the space of square integrable
functions on the closed interval $T = [a, b]$ of the real line. It is assumed that the functional
random variable $\chi$ has a functional mean $\mu_\chi(t) = E[\chi(t)]$ and a covariance operator
$\Gamma_\chi$ given by:

$$\Gamma_\chi(\eta) = E\left[ (\chi - \mu_\chi) \otimes (\chi - \mu_\chi)(\eta) \right], \tag{5}$$

such that, for any $\eta \in L^2(T)$,

$$(\chi - \mu_\chi) \otimes (\chi - \mu_\chi)(\eta) = \langle \chi - \mu_\chi, \eta \rangle (\chi - \mu_\chi), \tag{6}$$

where $\langle \cdot, \cdot \rangle$ denotes the usual inner product on $L^2(T)$, i.e.:

$$\langle \chi - \mu_\chi, \eta \rangle = \int_T \left( \chi(t) - \mu_\chi(t) \right) \eta(t) \, dt.$$

The covariance operator $\Gamma_\chi$ in (5) is a well-defined compact operator so long as
$E[\|\chi\|_2^2] < \infty$ (see Hall and Hosseini-Nasab, 2006), where $\|\cdot\|_2$ denotes the usual
norm in $L^2(T)$. Under this assumption, there exists a sequence of non-negative eigenvalues of
$\Gamma_\chi$, denoted by $\lambda_1 \geq \lambda_2 \geq \cdots$, where $\sum_{k=1}^{\infty} \lambda_k < \infty$,
and a set of orthonormal eigenfunctions of $\Gamma_\chi$, denoted by $\psi_1, \psi_2, \ldots$, such that
$\Gamma_\chi(\psi_k) = \lambda_k \psi_k$, for $k = 1, 2, \ldots$ The eigenfunctions $\psi_1, \psi_2, \ldots$
form an orthonormal basis in $L^2(T)$ and allow the Karhunen-Loève expansion of
the functional random variable $\chi$ (see Hall and Hosseini-Nasab, 2006) to be written in terms of the
elements of the basis as follows:

$$\chi = \mu_\chi + \sum_{k=1}^{\infty} \theta_k \psi_k, \tag{7}$$

where $\theta_k = \langle \chi - \mu_\chi, \psi_k \rangle$, for $k = 1, 2, \ldots$, are the functional
principal component scores of $\chi$. It is well known that the functional principal component scores
$\theta_k$, for $k = 1, 2, \ldots$, are uncorrelated random variables with zero mean and variance
$\lambda_k$ since $\psi_1, \psi_2, \ldots$ are orthonormal.

In order to obtain an expression similar to (1) in the functional setting, it is necessary
to define the inverse of the covariance operator, $\Gamma_\chi^{-1}$, which exists only under certain
circumstances. However, even in this case, $\Gamma_\chi^{-1}$ is unbounded and not continuous. Mas (2007)
has proposed a regularized inverse operator, which is a linear operator "close" to $\Gamma_\chi^{-1}$
and having good properties. For that, if $\Gamma_\chi^{-1}$ exists, it is given by:

$$\Gamma_\chi^{-1}(\zeta) = \sum_{k=1}^{\infty} \frac{1}{\lambda_k} (\psi_k \otimes \psi_k)(\zeta),$$

where $\zeta$ is a function in the range of $\Gamma_\chi$. Then, the regularized inverse operator,
denoted by $\Gamma_K^{-1}$, is defined as:

$$\Gamma_K^{-1}(\zeta) = \sum_{k=1}^{K} \frac{1}{\lambda_k} (\psi_k \otimes \psi_k)(\zeta),$$

where $K$ is a given threshold. Similarly, it is also possible to define a regularized square root
inverse operator, given by:

$$\Gamma_K^{-1/2}(\zeta) = \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} (\psi_k \otimes \psi_k)(\zeta), \tag{8}$$

which allows the functional Mahalanobis semi-distance between $\chi$ and $\mu_\chi$ to be defined,
inspired by (1), as follows:

Definition 2.1 Let $\chi$ be a functional random variable defined in $L^2(T)$ with mean function
$\mu_\chi$ and compact covariance operator $\Gamma_\chi$. The functional Mahalanobis semi-distance
between $\chi$ and $\mu_\chi$, denoted by $d_{FM}^K(\chi, \mu_\chi)$, is defined as:

$$d_{FM}^K(\chi, \mu_\chi) = \left\langle \Gamma_K^{-1/2}(\chi - \mu_\chi), \Gamma_K^{-1/2}(\chi - \mu_\chi) \right\rangle^{1/2}.$$

As noted before, the multivariate Mahalanobis distance may be expressed in terms of the
principal component scores of the multivariate random variable $\mathbf{x}$. Similarly, it is possible
to express the functional Mahalanobis semi-distance in terms of the functional principal
component scores of the functional random variable $\chi$, as stated in the next proposition, which
is proved in the appendix:

Proposition 2.1 The functional Mahalanobis semi-distance between $\chi$ and $\mu_\chi$ can be
written as follows:

$$d_{FM}^K(\chi, \mu_\chi) = \left( \sum_{k=1}^{K} \omega_k^2 \right)^{1/2}, \tag{9}$$

where $\omega_k = \theta_k / \lambda_k^{1/2}$, for $k = 1, \ldots, K$, are the standardized functional
principal component scores.

Therefore, as in the multivariate case, the functional Mahalanobis semi-distance
between $\chi$ and $\mu_\chi$ is the Euclidean norm of the standardized functional principal component
scores. This property provides a simple way to compute the functional Mahalanobis
semi-distance in practice. It is also interesting to extend the definition of functional Mahalanobis
semi-distance to the general situation of the distance between two independent and identically
distributed functional random variables.

Definition 2.2 Let $\chi_1$ and $\chi_2$ be two independent and identically distributed functional
random variables defined in $L^2(T)$ with mean function $\mu_\chi$ and compact covariance operator
$\Gamma_\chi$. The functional Mahalanobis semi-distance between the functions $\chi_1$ and $\chi_2$,
denoted by $d_{FM}^K(\chi_1, \chi_2)$, is given by:

$$d_{FM}^K(\chi_1, \chi_2) = \left\langle \Gamma_K^{-1/2}(\chi_1 - \chi_2), \Gamma_K^{-1/2}(\chi_1 - \chi_2) \right\rangle^{1/2}.$$

The previous definition leads to the following proposition, proven in the appendix:

Proposition 2.2 The functional Mahalanobis semi-distance between $\chi_1$ and $\chi_2$ can be
written as follows:

$$d_{FM}^K(\chi_1, \chi_2) = \left( \sum_{k=1}^{K} (\omega_{1k} - \omega_{2k})^2 \right)^{1/2}, \tag{10}$$

where $\omega_{1k} = \theta_{1k} / \lambda_k^{1/2}$ and $\omega_{2k} = \theta_{2k} / \lambda_k^{1/2}$, for
$k = 1, 2, \ldots$, are the standardized functional principal component scores of $\chi_1$ and
$\chi_2$, respectively.

Therefore, the functional Mahalanobis semi-distance between two independent and
identically distributed functional random variables can be written as the Euclidean distance
between the standardized functional principal component scores of both functional random
variables. The next result shows that $d_{FM}^K$ is indeed a functional semi-distance.

Proposition 2.3 Let $\chi_1$, $\chi_2$ and $\chi_3$ be three independent and identically distributed
functional random variables defined in $L^2(T)$ with mean function $\mu_\chi$ and compact covariance
operator $\Gamma_\chi$. For any positive integer $K$, $d_{FM}^K$ verifies the following three properties:

1. $d_{FM}^K(\chi_1, \chi_2) \geq 0$.

2. $d_{FM}^K(\chi_1, \chi_2) = d_{FM}^K(\chi_2, \chi_1)$.

3. $d_{FM}^K(\chi_1, \chi_2) \leq d_{FM}^K(\chi_1, \chi_3) + d_{FM}^K(\chi_3, \chi_2)$.

Consequently, $d_{FM}^K$ is a functional semi-distance.
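
To see why only a semi-distance is obtained, a simple worked case may help. It is a sketch under the assumption that two fixed curves differ only along the eigenfunction $\psi_{K+1}$, which is discarded by the regularization; the constant $c$ is introduced here purely for illustration.

```latex
\[
\chi_2 = \chi_1 + c\,\psi_{K+1}, \quad c \neq 0
\;\Longrightarrow\;
\theta_{2k} - \theta_{1k} = \langle \chi_2 - \chi_1, \psi_k \rangle
= c\,\langle \psi_{K+1}, \psi_k \rangle = 0, \quad k = 1, \ldots, K,
\]
\[
d_{FM}^{K}(\chi_1, \chi_2)
= \Big( \sum_{k=1}^{K} \frac{(\theta_{1k} - \theta_{2k})^2}{\lambda_k} \Big)^{1/2} = 0
\quad \text{although } \chi_1 \neq \chi_2 .
\]
```

Hence a zero value of the semi-distance does not imply that the two functions coincide, which is the one property of a distance that fails here.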



It is well known that if the multivariate random variable $\mathbf{x}$ has a $p$-dimensional Gaussian
distribution, then $d_M^2(\mathbf{x}, \mathbf{m}_x)$ has a $\chi^2_p$ distribution and, consequently,
$E[d_M^2(\mathbf{x}, \mathbf{m}_x)] = p$ and $V[d_M^2(\mathbf{x}, \mathbf{m}_x)] = 2p$. To end this
section, the following theorem shows a similar result for the functional Mahalanobis semi-distance.

Theorem 2.1 If $\chi$ is a Gaussian process, then $d_{FM}^K(\chi, \mu_\chi)^2 \sim \chi^2_K$, so that
$E[d_{FM}^K(\chi, \mu_\chi)^2] = K$ and $V[d_{FM}^K(\chi, \mu_\chi)^2] = 2K$.
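
The moments stated in Theorem 2.1 can be illustrated by simulation. The sketch below is only a numerical check under stated assumptions: a zero-mean Gaussian process on $[0, 1]$ built from a Karhunen-Loève expansion truncated at 20 terms, with eigenfunctions $\sqrt{2}\sin((k - 0.5)\pi t)$ and eigenvalues $1/((k - 0.5)\pi)^2$ chosen for illustration, the true eigen-structure treated as known, and the inner products approximated with the trapezoidal rule. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_curves, n_terms, K = 20000, 20, 5           # illustrative choices
t = np.linspace(0.0, 1.0, 201)
h = t[1] - t[0]
w = np.full(t.size, h)
w[[0, -1]] = h / 2                            # trapezoidal quadrature weights

k = np.arange(1, n_terms + 1)
lam = 1.0 / ((k - 0.5) * np.pi) ** 2                        # assumed eigenvalues
psi = np.sqrt(2.0) * np.sin(np.outer(k - 0.5, np.pi * t))   # assumed eigenfunctions, shape (n_terms, J)

# Zero-mean Gaussian curves from the truncated Karhunen-Loeve expansion.
theta = rng.standard_normal((n_curves, n_terms)) * np.sqrt(lam)
curves = theta @ psi

# Scores <chi, psi_k> for k = 1, ..., K by numerical integration, then the squared semi-distance.
scores = (curves * w) @ psi[:K].T
d2 = np.sum(scores**2 / lam[:K], axis=1)

mc_mean, mc_var = d2.mean(), d2.var(ddof=1)
print(mc_mean, mc_var)    # should be close to K and 2K, respectively
```

With $K = 5$, the Monte Carlo mean and variance should be near 5 and 10, up to simulation error.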

2.2 Practical implementation

In practice, the functions are not observed continuously over all the points in the closed
interval $T = [a, b]$, so that calculation of the functional Mahalanobis semi-distances as defined
in (9) and (10) is not possible. Assume now that a dataset is observed with the following
form:

$$\left\{ \chi_i(t_{i,j}) : i = 1, \ldots, n \text{ and } j = 1, \ldots, J_i \right\}, \tag{11}$$

where $n$ is the number of observed curves and $J_i$ is the number of observations of the function
$\chi_i$ at the points $t_{i,1}, \ldots, t_{i,J_i}$. Note that neither the observation points nor their
number are assumed to be the same for all the functions. In this situation, the usual approach to
obtain closed form expressions of the set of functions is to use basis functions. In general, a
basis is a system of functions, denoted by $\phi_m$, for $m = 1, 2, \ldots$, orthogonal or not, such that,
for $i = 1, \ldots, n$:

$$\chi_i(t) \approx \sum_{m=1}^{M} \beta_{im} \phi_m(t),$$

where $\beta_{im}$, for $m = 1, \ldots, M$, are the coefficients of the expansion. The number of basis
functions, $M$, should be chosen on a case by case basis, although $M$ is usually chosen
such that the functional approximations are close to the original counterparts with some
smoothing that eliminates the most obvious noise. The choice of the basis is also important.
There are several possibilities, including polynomial, wavelet, Fourier and spline bases,
among others. For periodic or nearly periodic datasets, the Fourier basis is an adequate choice.
For nonperiodic datasets, B-splines are typically used. See Ramsay and Silverman (2005)
for more information on basis functions. The simplest method to estimate the
coefficients of the expansion is to minimize the least squares criterion:

$$\sum_{j=1}^{J_i} \left[ \chi_i(t_{i,j}) - \sum_{m=1}^{M} \beta_{im} \phi_m(t_{i,j}) \right]^2.$$
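
As an illustration of the least squares step, the sketch below fits the coefficients of a single curve observed at irregular points. It assumes a Fourier basis on $[0, 1]$ and NumPy's `numpy.linalg.lstsq`; the helper names `fourier_basis` and `fit_coefficients`, the simulated coefficients and the noise level are hypothetical choices made only for this example.

```python
import numpy as np

def fourier_basis(t, M):
    """Evaluate the first M Fourier basis functions on [0, 1]; returns an array of shape (len(t), M)."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    r = 1
    while len(cols) < M:
        cols.append(np.sqrt(2.0) * np.sin(2 * np.pi * r * t))
        if len(cols) < M:
            cols.append(np.sqrt(2.0) * np.cos(2 * np.pi * r * t))
        r += 1
    return np.column_stack(cols)

def fit_coefficients(t_obs, y_obs, M):
    """Least squares estimate of the M expansion coefficients of one curve."""
    Phi = fourier_basis(t_obs, M)
    beta, *_ = np.linalg.lstsq(Phi, y_obs, rcond=None)
    return beta

# One simulated curve observed with noise at 60 irregular points (made-up values).
rng = np.random.default_rng(1)
t_obs = np.sort(rng.uniform(0.0, 1.0, size=60))
true_beta = np.array([0.5, 1.0, -0.7, 0.0, 0.3])
y_obs = fourier_basis(t_obs, 5) @ true_beta + rng.normal(0.0, 0.05, size=t_obs.size)

beta_hat = fit_coefficients(t_obs, y_obs, M=5)
smoothed = fourier_basis(np.linspace(0.0, 1.0, 101), 5) @ beta_hat   # curve on a common grid
```

Because each curve is fitted separately, the observation points may differ between curves, and the fitted expansions can then be evaluated on a common grid for the later steps.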



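Now, with the smoothed functional sample, it is possible to estimate the functional mean
$\mu_\chi$ with the sample functional mean, $\hat{\mu}_\chi$, given by:

$$\hat{\mu}_\chi = \frac{1}{n} \sum_{i=1}^{n} \chi_i,$$

and the covariance operator $\Gamma_\chi$ with the sample covariance operator, $\hat{\Gamma}_\chi$,
such that, for any $\eta \in L^2(T)$:

$$\hat{\Gamma}_\chi(\eta) = \frac{1}{n} \sum_{i=1}^{n} \langle \chi_i - \hat{\mu}_\chi, \eta \rangle (\chi_i - \hat{\mu}_\chi).$$

Then, the eigenfunctions and eigenvalues of the covariance operator $\Gamma_\chi$ can be approximated
with those of $\hat{\Gamma}_\chi$, leading to estimates $\hat{\psi}_1, \hat{\psi}_2, \ldots$ and
$\hat{\lambda}_1, \hat{\lambda}_2, \ldots$, respectively. Therefore, the functional principal component
scores corresponding to curve $\chi_i$, i.e., $\theta_{ik} = \langle \chi_i - \mu_\chi, \psi_k \rangle$, are
estimated with $\hat{\theta}_{ik} = \langle \chi_i - \hat{\mu}_\chi, \hat{\psi}_k \rangle$, for
$k = 1, 2, \ldots$, which allows the functional Mahalanobis semi-distance between $\chi_i$ and the
functional sample mean $\hat{\mu}_\chi$ to be defined as follows:

$$\hat{d}_{FM}^K(\chi_i, \hat{\mu}_\chi) = \left( \sum_{k=1}^{K} \hat{\omega}_{ik}^2 \right)^{1/2},$$

where $\hat{\omega}_{ik} = \hat{\theta}_{ik} / \hat{\lambda}_k^{1/2}$, for $k = 1, \ldots, K$, are the sample
standardized functional principal component scores. Similarly, the functional Mahalanobis
semi-distance between two functions of the sample, $\chi_i$ and $\chi_{i'}$, can be written as follows:

$$\hat{d}_{FM}^K(\chi_i, \chi_{i'}) = \left( \sum_{k=1}^{K} (\hat{\omega}_{ik} - \hat{\omega}_{i'k})^2 \right)^{1/2},$$

where $\hat{\omega}_{i'k} = \hat{\theta}_{i'k} / \hat{\lambda}_k^{1/2}$, for $k = 1, \ldots, K$.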
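
The sample quantities of this section can be computed on a grid as sketched below. This is one possible discretization, not the only one: it assumes that the smoothed curves have already been evaluated on a common grid, approximates the inner products with trapezoidal weights, uses the $1/n$ normalization of the sample covariance operator, and relies on `numpy.linalg.eigh`. The function names and the simulated curves are hypothetical.

```python
import numpy as np

def fpca(curves, w, K):
    """Sample mean, first K eigenvalues and eigenfunctions (rows) of the sample covariance operator.

    curves: (n, J) array of curve values on a common grid; w: (J,) quadrature weights.
    """
    mean = curves.mean(axis=0)
    centered = curves - mean
    cov = centered.T @ centered / curves.shape[0]       # covariance kernel on the grid
    sw = np.sqrt(w)
    # Symmetrized discretization of the eigenproblem of the integral operator.
    vals, vecs = np.linalg.eigh(sw[:, None] * cov * sw[None, :])
    top = np.argsort(vals)[::-1][:K]
    lam = vals[top]
    psi = (vecs[:, top] / sw[:, None]).T                # (K, J), orthonormal w.r.t. the weights
    return mean, lam, psi

def standardized_scores(curves, mean, lam, psi, w):
    """Sample standardized functional principal component scores, shape (n, K)."""
    theta = ((curves - mean) * w) @ psi.T
    return theta / np.sqrt(lam)

def fm_distance(omega_a, omega_b):
    """Functional Mahalanobis semi-distance from two vectors of standardized scores."""
    return float(np.linalg.norm(omega_a - omega_b))

# Illustrative simulated curves (made-up generating model).
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 101)
w = np.full(t.size, t[1] - t[0])
w[[0, -1]] /= 2
k = np.arange(1, 11)
basis = np.sqrt(2.0) * np.sin(np.outer(k - 0.5, np.pi * t))
curves = (rng.standard_normal((100, 10)) / ((k - 0.5) * np.pi)) @ basis

K = 4
mean, lam, psi = fpca(curves, w, K)
omega = standardized_scores(curves, mean, lam, psi, w)
d_to_mean = np.linalg.norm(omega, axis=1)       # semi-distance of each curve to the sample mean
d_pair = fm_distance(omega[0], omega[1])        # semi-distance between the first two curves
```

With the $1/n$ normalization, each standardized score has sample second moment equal to one, so the average of the squared semi-distances to the sample mean equals $K$.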
3 Classification with the functional Mahalanobis semi-distance

Among all the possible applications of the functional Mahalanobis semi-distance introduced
in the previous section, this paper focuses on the supervised classification problem in the
functional setting. Consider a sample of functional observations such that it is known in
advance that each function comes from one of $G$ predefined classes. Therefore, the whole
sample can be split into $G$ subsamples, denoted by $\chi_{g1}, \ldots, \chi_{gn_g}$, for
$g = 1, \ldots, G$, respectively, where $n = n_1 + \cdots + n_G$ is the sample size of the whole
dataset. Then, the idea is to use the information provided by the set of observations to construct
classification rules that can be used to classify a new ungrouped functional observation $\chi_0$.
The aim of this section is to propose new procedures based on the combination of well known
functional classification methods with the functional Mahalanobis semi-distance as a measure of
proximity between functional objects. In particular, four procedures are presented.

3.1 The k-nearest neighbor (kNN) procedure

The k-nearest neighbor (kNN) procedure is one of the most popular methods used to perform
supervised classification in multivariate settings. The method is very simple and appears to
perform very well in many situations. Its generalization to infinite-dimensional
spaces has been studied by Biau et al. (2005), Cérou and Guyader (2006) and Baíllo et
al. (2011), among others. The kNN method starts by computing the distances between the
new function to classify, $\chi_0$, and all the functions in the observed sample. Next, the method
finds the $k$ functional observations in the sample closest in distance to $\chi_0$. Finally, the
new observation $\chi_0$ is classified by majority vote among the $k$ neighbors. Cérou and
Guyader (2006) have shown that the kNN procedure is not universally consistent. However,
these authors have obtained sufficient conditions for consistency of the kNN classifier when
the functional random variable takes values in a separable metric space. Additionally, Baíllo
et al. (2011) have shown that the optimal classification rule can be explicitly obtained for a
class of Gaussian processes with triangular covariance operators. The previous papers have
considered three functional distances for the kNN classifier: the $L^1$, $L^2$ and $L^\infty$ distances.
In particular, the $L^1$, $L^2$ and $L^\infty$ distances between $\chi_0$ and the functional observation
$\chi_{gi}$, for $g = 1, \ldots, G$ and $i = 1, \ldots, n_g$, are given by:

$$d_1(\chi_0, \chi_{gi}) = \int_T |\chi_0(t) - \chi_{gi}(t)| \, dt,$$

$$d_2(\chi_0, \chi_{gi}) = \left( \int_T (\chi_0(t) - \chi_{gi}(t))^2 \, dt \right)^{1/2}$$

and,

$$d_\infty(\chi_0, \chi_{gi}) = \sup \left\{ |\chi_0(t) - \chi_{gi}(t)| : t \in T \right\},$$

respectively. Note that in order to compute the $L^1$, $L^2$ and $L^\infty$ distances it is necessary
to first smooth the discretized values of the function $\chi_0$ as seen in Section 2.2. Also, it is
important to note that no information about the class membership is used to compute the
previous distances.
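
For smoothed curves evaluated on a common grid, these three distances can be approximated as in the following sketch. It assumes trapezoidal quadrature weights for the integrals and replaces the supremum by the maximum over the grid; the function name `lp_distances` and the two test functions are hypothetical.

```python
import numpy as np

def lp_distances(f, g, w):
    """Approximate L1, L2 and L-infinity distances between two curves on a common grid.

    f, g: (J,) arrays of curve values; w: (J,) quadrature weights.
    """
    diff = np.abs(f - g)
    d1 = float(np.sum(w * diff))
    d2 = float(np.sqrt(np.sum(w * diff**2)))
    dinf = float(np.max(diff))          # the supremum is approximated by the grid maximum
    return d1, d2, dinf

t = np.linspace(0.0, 1.0, 1001)
w = np.full(t.size, t[1] - t[0])
w[[0, -1]] /= 2                         # trapezoidal weights

# Check against a case with known values: f(t) = t and g(t) = 0 on [0, 1].
d1, d2, dinf = lp_distances(t, np.zeros_like(t), w)
print(d1, d2, dinf)                     # approximately 1/2, 1/sqrt(3) and 1
```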

On the other hand, the kNN classifier can be used in conjunction with the functional
Mahalanobis semi-distance. Contrary to the previous distances, there are two different ways to
compute the functional Mahalanobis semi-distance in classification problems. In the
first case, assume that the functional means under class $g$, denoted by $\mu_{\chi_g}$, are different
but the covariance operator, denoted by $\Gamma_\chi$, is the same for all the classes. Then, the
functional means, $\mu_{\chi_g}$, are estimated using the functional sample mean of the functions in
class $g$, i.e.:

$$\hat{\mu}_{\chi_g} = \frac{1}{n_g} \sum_{i=1}^{n_g} \chi_{gi}, \tag{12}$$

while the common covariance operator, $\Gamma_\chi$, is estimated with the within class covariance
operator given by:

$$\hat{\Gamma}_\chi(\eta) = \frac{1}{n} \sum_{g=1}^{G} \sum_{i=1}^{n_g}
\langle \chi_{gi} - \hat{\mu}_{\chi_g}, \eta \rangle (\chi_{gi} - \hat{\mu}_{\chi_g}), \tag{13}$$

for $\eta \in L^2(T)$. Now, the functional Mahalanobis semi-distance between $\chi_0$ and the functional
observation $\chi_{gi}$, for $g = 1, \ldots, G$ and $i = 1, \ldots, n_g$, is given by:

$$\hat{d}_{FM}^K(\chi_0, \chi_{gi}) = \left( \sum_{k=1}^{K} (\hat{\omega}_{g0k} - \hat{\omega}_{gik})^2 \right)^{1/2}, \tag{14}$$

where $\hat{\omega}_{g0k} = \hat{\theta}_{g0k} / \hat{\lambda}_k^{1/2}$ and
$\hat{\omega}_{gik} = \hat{\theta}_{gik} / \hat{\lambda}_k^{1/2}$ are the standardized sample functional
principal component scores, with $\hat{\theta}_{g0k} = \langle \chi_0 - \hat{\mu}_{\chi_g}, \hat{\psi}_k \rangle$
and $\hat{\theta}_{gik} = \langle \chi_{gi} - \hat{\mu}_{\chi_g}, \hat{\psi}_k \rangle$, respectively. Here,
$\hat{\psi}_1, \ldots, \hat{\psi}_K$ and $\hat{\lambda}_1, \ldots, \hat{\lambda}_K$ are the eigenfunctions and
eigenvalues of the sample within class covariance operator (13). Similarly, the functional principal
components (FPC) semi-distance proposed by Ferraty and Vieu (2006) between $\chi_0$ and the functional
observation $\chi_{gi}$, for $g = 1, \ldots, G$ and $i = 1, \ldots, n_g$, can be written as follows:

$$\hat{d}_{FPC}^{K'}(\chi_0, \chi_{gi}) = \left( \sum_{k=1}^{K'} (\hat{\theta}_{g0k} - \hat{\theta}_{gik})^2 \right)^{1/2}, \tag{15}$$

where $K'$ is a certain threshold. In the second case, assume that both the functional means
and the covariance operators, denoted by $\Gamma_{\chi_g}$, are different for the classes $1, \ldots, G$.
Then, the functional means, $\mu_{\chi_g}$, are estimated using (12), while the covariance operator of
each class is estimated using the functional sample covariance operator of the functions in class
$g$, i.e.:

$$\hat{\Gamma}_{\chi_g}(\eta) = \frac{1}{n_g} \sum_{i=1}^{n_g}
\langle \chi_{gi} - \hat{\mu}_{\chi_g}, \eta \rangle (\chi_{gi} - \hat{\mu}_{\chi_g}), \tag{16}$$

for $\eta \in L^2(T)$. Now, the functional Mahalanobis semi-distance between $\chi_0$ and the functional
observation $\chi_{gi}$, for $g = 1, \ldots, G$ and $i = 1, \ldots, n_g$, is as in (14) but with
$\hat{\omega}_{g0k} = \hat{\theta}_{g0k} / \hat{\lambda}_{gk}^{1/2}$ and
$\hat{\omega}_{gik} = \hat{\theta}_{gik} / \hat{\lambda}_{gk}^{1/2}$, where
$\hat{\theta}_{g0k} = \langle \chi_0 - \hat{\mu}_{\chi_g}, \hat{\psi}_{gk} \rangle$,
$\hat{\theta}_{gik} = \langle \chi_{gi} - \hat{\mu}_{\chi_g}, \hat{\psi}_{gk} \rangle$, and
$\hat{\psi}_{g1}, \ldots, \hat{\psi}_{gK}$ and $\hat{\lambda}_{g1}, \ldots, \hat{\lambda}_{gK}$ are the
eigenfunctions and eigenvalues of the sample covariance operator in (16), respectively. Also, the FPC
semi-distance in this second case can be written as in (15) but considering the sample functional
scores obtained with the eigenfunctions of the covariance operator (16), as before.
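
A minimal sketch of the kNN classifier with the functional Mahalanobis semi-distance under a common covariance operator (the first case) is given below. It assumes curves on a common grid with trapezoidal weights and a $1/n$ within class normalization; the function names, the default values of $K$ and $k$, and the simulated two-class data (a shift of the class mean along one basis function) are hypothetical choices for illustration only.

```python
import numpy as np

def pooled_fpca(curves, labels, w, K):
    """First K eigenvalues and eigenfunctions (rows) of the within class covariance operator."""
    centered = np.array(curves, dtype=float)
    for g in np.unique(labels):
        centered[labels == g] -= centered[labels == g].mean(axis=0)
    cov = centered.T @ centered / centered.shape[0]
    sw = np.sqrt(w)
    vals, vecs = np.linalg.eigh(sw[:, None] * cov * sw[None, :])
    top = np.argsort(vals)[::-1][:K]
    return vals[top], (vecs[:, top] / sw[:, None]).T

def knn_fm_predict(train, labels, new_curves, w, K=3, k=5):
    """kNN majority vote under the functional Mahalanobis semi-distance (common covariance)."""
    lam, psi = pooled_fpca(train, labels, w, K)

    # With common eigenfunctions the class mean cancels in the difference of
    # scores, so the curves can be projected without centering.
    def standardize(x):
        return ((x * w) @ psi.T) / np.sqrt(lam)

    om_train, om_new = standardize(train), standardize(new_curves)
    dists = np.linalg.norm(om_new[:, None, :] - om_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    classes = np.unique(labels)
    votes = (labels[nearest][:, :, None] == classes).sum(axis=1)
    return classes[np.argmax(votes, axis=1)]    # ties go to the first class

# Illustrative two-class data: class 1 has its mean shifted along the first basis function.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 101)
w = np.full(t.size, t[1] - t[0])
w[[0, -1]] /= 2
j = np.arange(1, 11)
basis = np.sqrt(2.0) * np.sin(np.outer(j - 0.5, np.pi * t))
labels_all = np.repeat([0, 1], 150)
curves_all = (rng.standard_normal((300, 10)) / ((j - 0.5) * np.pi)) @ basis
curves_all += 3.0 * labels_all[:, None] * basis[0]

perm = rng.permutation(300)
train_idx, test_idx = perm[:200], perm[200:]
pred = knn_fm_predict(curves_all[train_idx], labels_all[train_idx], curves_all[test_idx], w)
accuracy = float(np.mean(pred == labels_all[test_idx]))
```

For the second case, the same idea applies with a separate eigen-decomposition per class, so that the distance from the new curve to a training curve of class $g$ uses the eigenfunctions and eigenvalues of that class.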



3.2 The centroid procedure 

The centroid procedure for functional datasets, proposed by Delaigle and Hall (2012), is
probably the fastest and simplest classification method for functional observations. The
centroid method consists in assigning a new function $\chi_0$ to the class with the closest mean. Note
that any functional distance can be used to implement the procedure. In particular, Delaigle
and Hall (2012) considered the case of $G = 2$ classes that have different means and a common
covariance operator, and proposed to project the functions onto a given direction and then
compute the squared Euclidean distance between the observations. More precisely, Delaigle
and Hall (2012) proposed to use the centroid classifier with the distance between $\chi_0$ and the
sample functional mean $\hat{\mu}_{\chi_g}$, for $g = 1, 2$, denoted by DH, and given by:

$$\hat{d}_{DH}^{K''}(\chi_0, \hat{\mu}_{\chi_g}) = \left( \sum_{k=1}^{K''} \hat{\omega}_{g0k} \hat{\mu}_{12k} \right)^2, \tag{17}$$

where $K''$ is a certain threshold, $\hat{\omega}_{g0k}$ is computed as in the previous subsection
assuming a common covariance operator, and

$$\hat{\mu}_{12k} = \frac{\langle \hat{\mu}_{\chi_2} - \hat{\mu}_{\chi_1}, \hat{\psi}_k \rangle}{\hat{\lambda}_k^{1/2}},$$

for $k = 1, \ldots, K''$.

Of course, other distances can be applied in the general case of $G$ classes. In particular,
the $L^1$, $L^2$ and $L^\infty$ distances and the two versions of the functional Mahalanobis and
functional principal components semi-distances introduced before, between $\chi_0$ and the sample
functional mean $\hat{\mu}_{\chi_g}$, for $g = 1, \ldots, G$, can be used. The semi-distances are
computed as in the previous subsection but replacing $\chi_{gi}$ with $\hat{\mu}_{\chi_g}$.
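
Since any functional distance can be plugged into the centroid rule, a generic sketch takes the distance as an argument. The example below uses an $L^2$ distance approximated on a grid; a functional Mahalanobis semi-distance could be passed instead as a callable acting on the two curves. The function names and the simulated curves (two classes with sine and cosine means plus noise) are hypothetical.

```python
import numpy as np

def l2_distance(f, g, w):
    """L2 distance between two curves on a common grid with quadrature weights w."""
    return float(np.sqrt(np.sum(w * (f - g) ** 2)))

def centroid_classify(train, labels, new_curve, distance):
    """Assign new_curve to the class whose sample mean function is closest under `distance`."""
    classes = np.unique(labels)
    means = [train[labels == g].mean(axis=0) for g in classes]
    dists = [distance(new_curve, m) for m in means]
    return classes[int(np.argmin(dists))], dists

# Illustrative data: class 1 curves fluctuate around sin(pi t), class 2 around cos(pi t).
rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 101)
w = np.full(t.size, t[1] - t[0])
w[[0, -1]] /= 2
labels = np.repeat([1, 2], 30)
class_means = np.where(labels[:, None] == 1, np.sin(np.pi * t), np.cos(np.pi * t))
train = class_means + rng.normal(0.0, 0.2, size=class_means.shape)

new_curve = np.sin(np.pi * t) + 0.1
label, dists = centroid_classify(train, labels, new_curve,
                                 lambda f, g: l2_distance(f, g, w))
```

Here the new curve lies close to the mean of class 1, so the smallest distance is the first one and the curve is assigned to that class.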

3.3 The functional linear and quadratic Bayes classification rules 

In multivariate statistics, the Bayes classification rule is derived as follows. Let $\mathbf{x}$ be a
$p$-dimensional continuous random variable and let $f_1, \ldots, f_G$ be the corresponding density
functions of $\mathbf{x}$ under the $G$ classes. Let $\pi_1, \ldots, \pi_G$ be the prior probabilities
assigned to the $G$ classes, verifying $\pi_1 + \cdots + \pi_G = 1$. Using Bayes' theorem, the
posterior probability that a new observation $\mathbf{x}_0$ generated from $\mathbf{x}$ comes from class
$g$ is given by:

$$P(g | \mathbf{x}_0) = \frac{\pi_g f_g(\mathbf{x}_0)}{\pi_1 f_1(\mathbf{x}_0) + \cdots + \pi_G f_G(\mathbf{x}_0)}, \tag{18}$$

where $P(1 | \mathbf{x}_0) + \cdots + P(G | \mathbf{x}_0) = 1$. The Bayes rule classifies $\mathbf{x}_0$ in
the class with the largest posterior probability. In other words, $\mathbf{x}_0$ is classified in class
$g$ if $\pi_g f_g(\mathbf{x}_0)$ is maximum. In particular, if the densities $f_g$ are assumed to be
Gaussian with different means $\mathbf{m}_{x_g}$ but identical covariance matrix $\mathbf{C}_x$, this is
equivalent to classifying $\mathbf{x}_0$ in class $g$ if:

$$d_M(\mathbf{x}_0, \mathbf{m}_{x_g})^2 - 2 \log \pi_g$$

is minimum, where $d_M(\mathbf{x}_0, \mathbf{m}_{x_g})^2 = (\mathbf{x}_0 - \mathbf{m}_{x_g})' \mathbf{C}_x^{-1} (\mathbf{x}_0 - \mathbf{m}_{x_g})$
is the squared Mahalanobis distance between $\mathbf{x}_0$ and $\mathbf{m}_{x_g}$.

Under the functional framework, the idea is to consider a similar rule but replacing the
multivariate Mahalanobis distance with the functional Mahalanobis semi-distance.
Consequently, assuming different means and a common covariance operator, the new observation
$\chi_0$ is assigned to the class $g$ if:

$$\hat{d}_{FM}^K(\chi_0, \hat{\mu}_{\chi_g})^2 - 2 \log \pi_g \tag{19}$$

is minimum. Note that the values of $\pi_g$ are usually fixed as the proportions of observations in
the sample in each class. In particular, if $\pi_1 = \cdots = \pi_G$, the linear Bayes classification
rule reduces to the centroid classifier with the functional Mahalanobis semi-distance assuming a
common covariance operator.

On the other hand, if in the multivariate case the densities $f_g$ are assumed to be Gaussian
with different means $\mathbf{m}_{x_g}$ and different covariance matrices $\mathbf{C}_{x_g}$, the Bayes
rule classifies $\mathbf{x}_0$ in class $g$ if:

$$d_M(\mathbf{x}_0, \mathbf{m}_{x_g})^2 + \log |\mathbf{C}_{x_g}| - 2 \log \pi_g$$

is minimum, where in this case $d_M(\mathbf{x}_0, \mathbf{m}_{x_g})^2 = (\mathbf{x}_0 - \mathbf{m}_{x_g})' \mathbf{C}_{x_g}^{-1} (\mathbf{x}_0 - \mathbf{m}_{x_g})$
is the squared Mahalanobis distance between $\mathbf{x}_0$ and $\mathbf{m}_{x_g}$. Under the functional
framework, the new observation $\chi_0$ is assigned to the class $g$ if:

$$\hat{d}_{FM}^K(\chi_0, \hat{\mu}_{\chi_g})^2 + \sum_{k=1}^{K} \log(\hat{\lambda}_{gk}) - 2 \log \pi_g \tag{20}$$

is minimum, where $\hat{\lambda}_{gk}$, for $k = 1, \ldots, K$, are the eigenvalues of the estimated
covariance operator under class $g$, and $K$ is the number of eigenfunctions used to compute
the functional Mahalanobis semi-distances.
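
Once the functional principal component scores of the new observation with respect to each class are available, both criteria reduce to simple arithmetic. The sketch below takes these scores and eigenvalues as given inputs; the function names and the numerical values are made up for illustration, and classes are identified by their position in the input lists.

```python
import numpy as np

def bayes_criterion(theta0, lam, prior, quadratic=False):
    """Linear criterion (squared semi-distance minus 2 log prior) for one class,
    plus the sum of log eigenvalues when quadratic=True."""
    theta0, lam = np.asarray(theta0, float), np.asarray(lam, float)
    value = np.sum(theta0**2 / lam) - 2.0 * np.log(prior)
    if quadratic:
        value += np.sum(np.log(lam))
    return float(value)

def classify(theta0_by_class, lam_by_class, priors, quadratic=False):
    """Return the index of the class with the smallest criterion, and all criterion values."""
    crit = [bayes_criterion(th, lm, p, quadratic)
            for th, lm, p in zip(theta0_by_class, lam_by_class, priors)]
    return int(np.argmin(crit)), crit

# Scores of the new curve relative to each class mean, with K = 2 (made-up values).
theta0 = [[1.0, 0.5], [2.0, 1.0]]
priors = [0.5, 0.5]

# Linear rule: the same eigenvalues for both classes.
linear_class, linear_crit = classify(theta0, [[2.0, 0.5], [2.0, 0.5]], priors)

# Quadratic rule: class-specific eigenvalues.
quad_class, quad_crit = classify(theta0, [[2.0, 0.5], [8.0, 4.0]], priors, quadratic=True)
```

In the quadratic example the squared semi-distance alone is smaller for the second class (0.75 against 1.0), but the sum of log eigenvalues penalizes its larger variability, so the first class is selected.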

It is important to note that although the functional linear and quadratic Bayes classification 
rules in (19) and (20) have been derived using the functional Mahalanobis semi-distance, 
these methods essentially consist in applying the multivariate linear and quadratic 
Bayes rules to the functional principal component scores, which are multivariate random 
variables. Hall et al. (2001) proposed to use the Bayes classification rule in (18) after 
estimating nonparametrically the density function of the functional principal component 
scores. However, these authors pointed out that a computationally less expensive method is 
to use the multivariate quadratic Bayes classification rule, which is essentially the rule 
given in (20). 
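
As a rough illustration of how rules (19) and (20) act on principal component scores, the following Python sketch may help. It assumes that the scores of the new curve, the class centers and the eigenvalues have already been estimated; the function names (`fm_sq_distance`, `linear_rule`, `quadratic_rule`) and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def fm_sq_distance(scores, center, eigvals):
    """Squared semi-distance from K scores: sum_k (s_k - c_k)^2 / lambda_k."""
    diff = np.asarray(scores, dtype=float) - np.asarray(center, dtype=float)
    return float(np.sum(diff ** 2 / np.asarray(eigvals, dtype=float)))

def linear_rule(scores, class_centers, eigvals, priors):
    """Sketch of rule (19): common eigenvalues, one criterion per class."""
    crit = [fm_sq_distance(scores, c, eigvals) - 2.0 * np.log(p)
            for c, p in zip(class_centers, priors)]
    return int(np.argmin(crit))

def quadratic_rule(scores_by_class, class_centers, eigvals_by_class, priors):
    """Sketch of rule (20): scores_by_class[g] are the scores of the new curve
    computed in the eigenbasis of class g (assumed to be given)."""
    crit = [fm_sq_distance(s, c, lam) + float(np.sum(np.log(lam))) - 2.0 * np.log(p)
            for s, c, lam, p in zip(scores_by_class, class_centers,
                                    eigvals_by_class, priors)]
    return int(np.argmin(crit))

# Toy example with K = 2 components and two classes (illustrative numbers only).
centers = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
label_linear = linear_rule([0.4, 0.1], centers, [1.0, 0.25], [0.5, 0.5])
label_quadratic = quadratic_rule(
    [[3.0, 0.0], [3.0, 0.0]], centers,
    [np.array([1.0, 0.25]), np.array([4.0, 1.0])], [0.5, 0.5])
```

In the toy example the first curve is closest to the first center, while the second curve is assigned to the second class because that class has the larger eigenvalues and the nearer center.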



4 Empirical results 

This section illustrates the performance of the functional classification procedures presented 
in Section 3 through several Monte Carlo simulations using four different scenarios and the 
analysis of two real datasets. 

4.1 Monte Carlo Study 

The Monte Carlo study considers four different scenarios. The first scenario consists of 
two Gaussian processes defined on the closed interval $I = [0, 1]$, with different means, 
$\mu_1(t) = 20 t^{1.1} (1 - t)$ and $\mu_2(t) = 20 t (1 - t)^{1.1}$, respectively, and a common 
covariance operator with eigenfunctions $\psi_k(t) = \sqrt{2} \sin((k - 0.5) \pi t)$ and 
associated eigenvalues $\lambda_k = 1/((k - 0.5) \pi)^2$, for $k = 1, 2, \ldots$ Then, 1000 
datasets are generated, each composed of $n_1$ functions from the first process and $n_2$ 
functions from the second process, such that $n = n_1 + n_2$ is the whole sample size. The 
generated functions are observed at $J$ equidistant points of the closed interval 
$I = [0, 1]$, where $J$ is either 50 or 100. Gaussian noise of variance 0.01 is added to 
each generated point. Once a dataset is generated in this way, the sample is split into a 
training sample and a test sample. The training sample is composed of $n_{10}$ functions of the 
first process and $n_{20}$ functions of the second process, while the test sample is composed of 
$n_{11}$ functions of the first process and $n_{21}$ functions of the second process, such that 
$n_{10} + n_{11} = n_1$ and $n_{20} + n_{21} = n_2$, respectively. In particular, two different 
configurations are considered. In the first one, $n = 200$ with $n_1 = n_2 = 100$ and 
$n_{10} = n_{20} = 75$. In the second one, $n = 300$ with $n_1 = n_2 = 150$ and 
$n_{10} = n_{20} = 120$. 
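
A minimal sketch of how one dataset of the first scenario could be generated is given below. It relies on a truncated Karhunen-Loeve expansion; the truncation level `n_terms`, the function name `simulate_scenario_one` and the default exponent of the mean functions are assumptions made for illustration, since the text does not state how the processes were simulated in practice.

```python
import numpy as np

def simulate_scenario_one(n1, n2, J, n_terms=50, exponent=1.1,
                          noise_var=0.01, rng=None):
    """Draw n1 + n2 noisy discretized curves on J equidistant points of [0, 1].

    n_terms (truncation of the expansion) and exponent are assumptions."""
    rng = np.random.default_rng(rng)
    t = np.linspace(0.0, 1.0, J)
    k = np.arange(1, n_terms + 1)
    lam = 1.0 / ((k - 0.5) * np.pi) ** 2                          # eigenvalues
    psi = np.sqrt(2.0) * np.sin(np.outer(t, (k - 0.5) * np.pi))   # shape (J, n_terms)
    mu1 = 20.0 * t ** exponent * (1.0 - t)
    mu2 = 20.0 * t * (1.0 - t) ** exponent

    def draw(n, mu):
        # Gaussian scores with variance lam_k, then additive measurement noise.
        scores = rng.standard_normal((n, n_terms)) * np.sqrt(lam)
        curves = mu + scores @ psi.T
        return curves + rng.normal(0.0, np.sqrt(noise_var), size=curves.shape)

    X = np.vstack([draw(n1, mu1), draw(n2, mu2)])
    y = np.repeat([0, 1], [n1, n2])                               # class labels
    return t, X, y

# First configuration of the study: n = 200 curves observed at J = 50 points.
t, X, y = simulate_scenario_one(100, 100, 50, rng=0)
```

The returned matrix `X` holds one discretized curve per row, which is the form expected by most smoothing and classification routines.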

The second scenario is similar to the first one but the eigenvalues of the covariance 
operators are given by $\lambda_{1k} = 1/((k - 0.5) \pi)^2$ and $\lambda_{2k} = 2/((k - 0.5) \pi)^2$, 
for $k = 1, 2, \ldots$, for the first and second processes, respectively. Finally, the third 
and fourth scenarios are similar to the first and second ones but replacing the Gaussian 
process with a standardized exponential process with rate 1 and with the same mean functions 
and covariance operators. The discrete trajectories are converted to functional observations 
using a B-spline basis of order 6 with 20 basis functions, which is enough to fit the data 
well. Figure 1 shows four datasets, once smoothing has been performed, corresponding to the 
four situations considered. As can be seen in the figure, the four scenarios appear to be 
complicated scenarios for classification purposes. 

Figure 1: B-spline approximations of datasets corresponding to the four experiments 
considered. There are 10 functions per generated process. 
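
One possible reading of the "standardized exponential process with rate 1" is that the Gaussian scores of the expansion are replaced by centered exponential variables with unit variance, rescaled by the eigenvalues. The sketch below follows that reading, which is an assumption and not a construction confirmed by the text; `standardized_exponential_scores` is a hypothetical helper name.

```python
import numpy as np

def standardized_exponential_scores(n, eigvals, rng=None):
    """Draw n score vectors with mean 0 and variances eigvals.

    Assumption: each score is sqrt(lambda_k) * (E - 1), with E ~ Exp(rate 1),
    so that E - 1 has mean 0 and variance 1 but is right-skewed."""
    rng = np.random.default_rng(rng)
    eigvals = np.asarray(eigvals, dtype=float)
    e = rng.exponential(scale=1.0, size=(n, eigvals.size))
    return (e - 1.0) * np.sqrt(eigvals)

# Eigenvalues of the two processes in the second and fourth scenarios.
k = np.arange(1, 11)
lam_first = 1.0 / ((k - 0.5) * np.pi) ** 2
lam_second = 2.0 / ((k - 0.5) * np.pi) ** 2
scores_second = standardized_exponential_scores(5, lam_second, rng=0)
```

Under this reading the third and fourth scenarios keep the same first two moments as the Gaussian ones and differ only in the shape of the score distribution.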

For each generated dataset, the functional observations in the test sample are classified 
using the following procedures: (1) the kNN procedure with seven different functional 
distances, namely the $L^1$, $L^2$ and $L^\infty$ distances as proposed by Baillo et al. (2011), 
the functional principal components (FPC) semi-distance assuming either a common or a 
different covariance operator, denoted by FPCc and FPCd, respectively, and the functional 
Mahalanobis (FM) semi-distance assuming either a common or a different covariance operator, 
denoted by FMc and FMd, respectively, as proposed in Section 3; (2) the centroid procedure 
with eight different functional distances, the first seven as in the kNN procedure and the 
distance proposed by Delaigle and Hall (2012) given in (17) and denoted by DH; (3) the linear 
and quadratic Bayes classification rules as proposed in Section 3, denoted by FLBCR and 
FQBCR, respectively; and (4) the multivariate linear and quadratic Bayes classification 
rules applied to the coefficients of the B-spline basis representation, denoted by LBCR 
Coef. and QBCR Coef., respectively. This last method can be seen as a simplification of the 
method proposed by James and Hastie (2001) that is much easier to implement than the 
original method. 

The threshold values needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and 
the FLBCR and FQBCR methods, and the maximum number of neighbors in the kNN procedures, are 
determined using cross-validation with a maximum number of 15 eigenfunctions and 9 
neighbors, respectively. Tables 1, 3, 5 and 7 show the proportion of correct classification 
of the test samples for the four scenarios. More precisely, each cell in the tables displays 
the mean and the standard deviation (between parentheses) of the proportion of correct 
classifications over the 1000 Monte Carlo samples. On the other hand, Tables 2, 4, 6 and 8 
show the means and standard deviations (between parentheses) of the optimal number of 
principal components needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and 
the FLBCR and FQBCR methods. 

In view of these tables, several comments are in order. First, in most of the cases, the kNN 
procedure with the FMc semi-distance attains the largest proportion of correct 
classifications. Second, the proportions of correct classifications for the third and fourth 
scenarios are larger than the corresponding proportions for the first and second scenarios, 
suggesting that Gaussianity is not necessarily an advantage for the functional Mahalanobis 
semi-distance. Third, in all the situations, classification methods in conjunction with the 
functional Mahalanobis semi-distance perform better than in conjunction with any other 
functional distance or semi-distance, or than any other alternative method such as the one 
based on the basis function coefficients. Fourth, there is not much difference in the 
results in terms of the number of points in the grid and the sample size. Fifth, at least in 
these scenarios, the use of the FPCd and FMd semi-distances is not of practical advantage. 
Indeed, even if the generated processes have different covariance operators, the methods 
appear to work better assuming a common covariance operator. Sixth, note that the 
multivariate quadratic Bayes classification rule for the coefficients of the basis expansion 
performs badly in all the situations. This is probably due to the large number of parameters 
that have to be estimated. Dimension reduction as done in James and Hastie (2001) may be a 
solution, but at the cost of increasing the complexity of the procedure. In this sense, note 
that very simple methods provide very good performance without a high level of 
sophistication. Seventh, note also that, in most of the situations, the standard deviations 
of the correct classification rates linked to methods based on the functional Mahalanobis 
semi-distance are smaller than those of any other alternative. Finally, note that there is 
no clear pattern relative to the number of functional principal components used with the 
FPCc, FPCd, FMc, FMd and DH semi-distances nor with the FLBCR and FQBCR methods. In summary, 
this limited simulation analysis appears to confirm that the functional Mahalanobis 
semi-distance may be a useful tool for classifying functional observations. 
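
To make the kNN procedure with the FMc semi-distance more concrete, the following sketch works directly on discretized curves. It estimates the eigen-structure from a pooled within-class covariance using a rectangle-rule quadrature on an equidistant grid, and it takes the numbers of components `K` and neighbors `k` as given instead of selecting them by cross-validation. These simplifications, and the names `pooled_fpca`, `standardized_scores` and `knn_predict`, are assumptions of the sketch; the paper works with B-spline smoothed curves.

```python
import numpy as np

def pooled_fpca(X, y, t, K):
    """Leading K eigenvalues/eigenfunctions of the pooled within-class covariance."""
    h = t[1] - t[0]                       # grid spacing used as quadrature weight
    Xc = X.astype(float).copy()
    classes = np.unique(y)
    for g in classes:                     # center each class by its own mean
        Xc[y == g] -= Xc[y == g].mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - classes.size)
    evals, evecs = np.linalg.eigh(cov * h)
    top = np.argsort(evals)[::-1][:K]
    return evals[top], evecs[:, top] / np.sqrt(h), h

def standardized_scores(X, lam, psi, h):
    """Scores divided by sqrt(eigenvalue); Euclidean distances between rows then
    play the role of the semi-distance truncated at K components."""
    return (X @ psi) * h / np.sqrt(lam)

def knn_predict(Z_train, y_train, Z_test, k):
    """Majority vote among the k nearest training curves (labels must be 0, 1, ...)."""
    d2 = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in y_train[nearest]])

# Toy data: two groups of noisy curves that differ by a smooth mean shift.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30)
y = np.repeat([0, 1], 20)
X = rng.standard_normal((40, 30)) + y[:, None] * np.sin(np.pi * t)
lam, psi, h = pooled_fpca(X, y, t, K=3)
Z = standardized_scores(X, lam, psi, h)
pred = knn_predict(Z, y, Z, k=1)          # each curve is its own nearest neighbor
```

Because only differences between curves enter the distance, the scores do not need to be centered before the neighbors are searched.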



Table 1: Proportion of correct classification for the first scenario 

n    J    Method      L^1      L^2      L^∞      FPCc     FPCd     FMc      FMd      DH       -
200  50   kNN         .7657    .7655    .7682    .7866    .7871    .8314    .8209    -        -
                      (.0550)  (.0547)  (.0574)  (.0525)  (.0513)  (.0444)  (.0513)
          Centroid    .6710    .6823    .6764    .6868    .6907    .8326    .8145    .8017    -
                      (.0870)  (.0863)  (.0854)  (.0860)  (.0861)  (.0490)  (.0480)  (.0570)
          FLBCR       -        -        -        -        -        -        -        -        .8326
                                                                                              (.0490)
          FQBCR       -        -        -        -        -        -        -        -        .8145
                                                                                              (.0480)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8201
                                                                                              (.0564)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7135
                                                                                              (.0708)

200  100  kNN         .7700    .7721    .7744    .7924    .7918    .8359    .8220    -        -
                      (.0584)  (.0588)  (.0553)  (.0584)  (.0570)  (.0463)  (.0570)
          Centroid    .6806    .6869    .6837    .6916    .6947    .8339    .8174    .8061    -
                      (.0920)  (.0832)  (.0806)  (.0817)  (.0824)  (.0542)  (.0539)  (.0625)
          FLBCR       -        -        -        -        -        -        -        -        .8339
                                                                                              (.0542)
          FQBCR       -        -        -        -        -        -        -        -        .8174
                                                                                              (.0539)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8254
                                                                                              (.0552)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7255
                                                                                              (.0646)

300  50   kNN         .7710    .7745    .7834    .7985    .7975    .8335    .8233    -        -
                      (.0523)  (.0524)  (.0490)  (.0496)  (.0510)  (.0452)  (.0510)
          Centroid    .6771    .6853    .6835    .6897    .6915    .8350    .8239    .8049    -
                      (.0794)  (.0752)  (.0729)  (.0761)  (.0762)  (.0468)  (.0457)  (.0536)
          FLBCR       -        -        -        -        -        -        -        -        .8350
                                                                                              (.0468)
          FQBCR       -        -        -        -        -        -        -        -        .8239
                                                                                              (.0457)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8325
                                                                                              (.0488)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7660
                                                                                              (.0545)

300  100  kNN         .7751    .7766    .7826    .7948    .7943    .8378    .8225    -        -
                      (.0529)  (.0521)  (.0520)  (.0523)  (.0492)  (.0450)  (.0492)
          Centroid    .6935    .6906    .6925    .6936    .6966    .8348    .8231    .8063    -
                      (.0838)  (.0702)  (.0713)  (.0695)  (.0693)  (.0448)  (.0445)  (.0529)
          FLBCR       -        -        -        -        -        -        -        -        .8348
                                                                                              (.0448)
          FQBCR       -        -        -        -        -        -        -        -        .8231
                                                                                              (.0445)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8290
                                                                                              (.0489)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7630
                                                                                              (.0545)

Table 2: Means and standard deviations (between parentheses) of the optimal number of 
principal components needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and 
the FLBCR and FQBCR methods for the first scenario 

Method      FPCc     FPCd     FMc      FMd      DH       -
kNN         6.36     6.53     7.48     7.06     -        -
            (2.87)   (2.97)   (2.90)   (2.82)
Centroid    4.16     4.99     7.45     6.50     6.48     -
            (2.09)   (2.52)   (2.99)   (2.87)   (3.06)
FLBCR       -        -        -        -        -        7.45
                                                         (2.99)
FQBCR       -        -        -        -        -        6.50
                                                         (2.87)

kNN         6.20     6.67     7.35     6.49     -        -
            (2.63)   (2.68)   (2.85)   (2.83)
Centroid    4.05     4.66     7.36     6.32     6.86     -
            (2.05)   (2.34)   (2.93)   (2.73)   (3.08)
FLBCR       -        -        -        -        -        7.36
                                                         (2.93)
FQBCR       -        -        -        -        -        6.32
                                                         (2.73)

kNN         6.57     6.69     7.40     7.21     -        -
            (2.87)   (2.90)   (2.95)   (2.83)
Centroid    4.38     4.87     7.48     6.46     6.51     -
            (2.17)   (2.49)   (3.11)   (2.87)   (3.00)
FLBCR       -        -        -        -        -        7.48
                                                         (3.11)
FQBCR       -        -        -        -        -        6.46
                                                         (2.87)

kNN         6.50     6.89     7.53     6.93     -        -
            (2.73)   (2.87)   (2.78)   (2.87)
Centroid    4.39     4.61     7.83     6.67     6.94     -
            (2.09)   (2.18)   (2.73)   (2.86)   (2.97)
FLBCR       -        -        -        -        -        7.83
                                                         (2.73)
FQBCR       -        -        -        -        -        6.67
                                                         (2.86)

Table 3: Proportion of correct classification for the second scenario 

n    J    Method      L^1      L^2      L^∞      FPCc     FPCd     FMc      FMd      DH       -
200  50   kNN         .7452    .7459    .7353    .7718    .7718    .8055    .7430    -        -
                      (.0556)  (.0555)  (.0543)  (.0540)  (.0525)  (.0474)  (.0525)
          Centroid    .6337    .6415    .6430    .6469    .6497    .7910    .7130    .7544    -
                      (.0783)  (.0774)  (.0734)  (.0774)  (.0783)  (.0549)  (.0554)  (.0754)
          FLBCR       -        -        -        -        -        -        -        -        .7910
                                                                                              (.0549)
          FQBCR       -        -        -        -        -        -        -        -        .7130
                                                                                              (.0554)
          LBCR Coef.  -        -        -        -        -        -        -        -        .7753
                                                                                              (.0616)
          QBCR Coef.  -        -        -        -        -        -        -        -        .5817
                                                                                              (.0502)

200  100  kNN         .7433    .7433    .7386    .7738    .7747    .8058    .7407    -        -
                      (.0550)  (.0516)  (.0481)  (.0494)  (.0469)  (.0445)  (.0469)
          Centroid    .6350    .6439    .6513    .6531    .6566    .7928    .7207    .7578    -
                      (.0869)  (.0885)  (.0813)  (.0868)  (.0865)  (.0528)  (.0582)  (.0637)
          FLBCR       -        -        -        -        -        -        -        -        .7928
                                                                                              (.0528)
          FQBCR       -        -        -        -        -        -        -        -        .7207
                                                                                              (.0582)
          LBCR Coef.  -        -        -        -        -        -        -        -        .7805
                                                                                              (.0581)
          QBCR Coef.  -        -        -        -        -        -        -        -        .5748
                                                                                              (.0514)

300  50   kNN         .7543    .7550    .7500    .7833    .7825    .8064    .7359    -        -
                      (.0475)  (.0455)  (.0460)  (.0438)  (.0433)  (.0397)  (.0433)
          Centroid    .6552    .6615    .6620    .6679    .6704    .7940    .7125    .7618    -
                      (.0714)  (.0705)  (.0752)  (.0695)  (.0708)  (.0494)  (.0548)  (.0550)
          FLBCR       -        -        -        -        -        -        -        -        .7940
                                                                                              (.0494)
          FQBCR       -        -        -        -        -        -        -        -        .7125
                                                                                              (.0548)
          LBCR Coef.  -        -        -        -        -        -        -        -        .7897
                                                                                              (.0533)
          QBCR Coef.  -        -        -        -        -        -        -        -        .5604
                                                                                              (.0374)

300  100  kNN         .7538    .7563    .7516    .7826    .7836    .8098    .7385    -        -
                      (.0490)  (.0510)  (.0490)  (.0488)  (.0492)  (.0408)  (.0492)
          Centroid    .6499    .6601    .6651    .6650    .6681    .7967    .7097    .7650    -
                      (.0754)  (.0761)  (.0740)  (.0761)  (.0760)  (.0504)  (.0524)  (.0594)
          FLBCR       -        -        -        -        -        -        -        -        .7967
                                                                                              (.0504)
          FQBCR       -        -        -        -        -        -        -        -        .7097
                                                                                              (.0524)
          LBCR Coef.  -        -        -        -        -        -        -        -        .7805
                                                                                              (.0490)
          QBCR Coef.  -        -        -        -        -        -        -        -        .5623
                                                                                              (.0401)

Table 4: Means and standard deviations (between parentheses) of the optimal number of 
principal components needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and 
the FLBCR and FQBCR methods for the second scenario 

Method      FPCc     FPCd     FMc      FMd      DH       -
kNN         6.46     6.11     6.11     4.94     -        -
            (2.88)   (2.77)   (2.53)   (2.69)
Centroid    3.78     4.33     6.89     3.81     6.65     -
            (2.00)   (2.49)   (2.98)   (1.75)   (3.08)
FLBCR       -        -        -        -        -        6.89
                                                         (2.98)
FQBCR       -        -        -        -        -        3.81
                                                         (1.75)

kNN         6.03     5.95     5.99     4.98     -        -
            (2.75)   (2.71)   (2.44)   (2.53)
Centroid    3.77     4.38     6.96     3.62     6.52     -
            (1.89)   (2.41)   (2.92)   (1.67)   (3.16)
FLBCR       -        -        -        -        -        6.96
                                                         (2.92)
FQBCR       -        -        -        -        -        3.62
                                                         (1.67)

kNN         6.40     6.33     5.95     4.76     -        -
            (2.67)   (2.66)   (2.37)   (2.24)
Centroid    3.99     4.53     7.62     3.44     6.81     -
            (1.97)   (2.47)   (2.93)   (1.20)   (3.11)
FLBCR       -        -        -        -        -        7.62
                                                         (2.93)
FQBCR       -        -        -        -        -        3.44
                                                         (1.20)

kNN         6.06     6.16     6.40     4.48     -        -
            (2.91)   (2.74)   (2.51)   (2.26)
Centroid    4.07     4.82     7.41     3.32     7.10     -
            (1.99)   (2.55)   (2.90)   (1.19)   (3.20)
FLBCR       -        -        -        -        -        7.41
                                                         (2.90)
FQBCR       -        -        -        -        -        3.32
                                                         (1.19)

Table 5: Proportion of correct classification for the third scenario 

n    J    Method      L^1      L^2      L^∞      FPCc     FPCd     FMc      FMd      DH       -
200  50   kNN         .8344    .8397    .8408    .8654    .8646    .8999    .8842    -        -
                      (.0498)  (.0504)  (.0439)  (.0445)  (.0437)  (.0379)  (.0437)
          Centroid    .6738    .7052    .7131    .7093    .7128    .8392    .8165    .8099    -
                      (.0847)  (.0857)  (.0786)  (.0867)  (.0860)  (.0500)  (.0531)  (.0536)
          FLBCR       -        -        -        -        -        -        -        -        .8392
                                                                                              (.0500)
          FQBCR       -        -        -        -        -        -        -        -        .8165
                                                                                              (.0531)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8264
                                                                                              (.0537)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7186
                                                                                              (.0649)

200  100  kNN         .8409    .8453    .8444    .8669    .8670    .9050    .8877    -        -
                      (.0477)  (.0469)  (.0488)  (.0436)  (.0434)  (.0365)  (.0434)
          Centroid    .6819    .7060    .7183    .7100    .7147    .8464    .8240    .8152    -
                      (.0927)  (.0944)  (.0917)  (.0954)  (.0949)  (.0512)  (.0513)  (.0574)
          FLBCR       -        -        -        -        -        -        -        -        .8464
                                                                                              (.0512)
          FQBCR       -        -        -        -        -        -        -        -        .8240
                                                                                              (.0513)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8342
                                                                                              (.0574)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7259
                                                                                              (.0648)

300  50   kNN         .8544    .8596    .8604    .8821    .8794    .9086    .8984    -        -
                      (.0408)  (.0409)  (.0407)  (.0368)  (.0364)  (.0294)  (.0364)
          Centroid    .6957    .7227    .7228    .7280    .7310    .8484    .8317    .8223    -
                      (.0806)  (.0754)  (.0704)  (.0759)  (.0764)  (.0456)  (.0421)  (.0499)
          FLBCR       -        -        -        -        -        -        -        -        .8484
                                                                                              (.0456)
          FQBCR       -        -        -        -        -        -        -        -        .8317
                                                                                              (.0421)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8415
                                                                                              (.0523)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7644
                                                                                              (.0576)

300  100  kNN         .8570    .8640    .8621    .8864    .8861    .9119    .9023    -        -
                      (.0405)  (.0389)  (.0433)  (.0346)  (.0354)  (.0338)  (.0354)
          Centroid    .7065    .7340    .7363    .7378    .7421    .8503    .8299    .8245    -
                      (.0868)  (.0834)  (.0721)  (.0838)  (.0834)  (.0461)  (.0455)  (.0526)
          FLBCR       -        -        -        -        -        -        -        -        .8503
                                                                                              (.0461)
          FQBCR       -        -        -        -        -        -        -        -        .8299
                                                                                              (.0455)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8464
                                                                                              (.0495)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7623
                                                                                              (.0565)

Table 6: Means and standard deviations (between parentheses) of the optimal number of 
principal components needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and 
the FLBCR and FQBCR methods for the third scenario 

Method      FPCc     FPCd     FMc      FMd      DH       -
kNN         5.71     6.04     5.35     5.00     -        -
            (2.84)   (2.74)   (2.59)   (2.45)
Centroid    4.25     4.73     7.20     6.09     6.42     -
            (2.24)   (2.26)   (2.77)   (2.70)   (2.74)
FLBCR       -        -        -        -        -        7.20
                                                         (2.77)
FQBCR       -        -        -        -        -        6.09
                                                         (2.70)

kNN         5.62     5.98     5.14     4.92     -        -
            (2.71)   (2.77)   (2.56)   (2.39)
Centroid    3.91     4.66     6.68     5.67     6.46     -
            (1.98)   (2.43)   (2.76)   (2.60)   (3.00)
FLBCR       -        -        -        -        -        6.68
                                                         (2.76)
FQBCR       -        -        -        -        -        5.67
                                                         (2.60)

kNN         6.10     6.29     4.93     4.78     -        -
            (2.84)   (2.89)   (2.37)   (2.26)
Centroid    4.22     4.84     7.14     6.04     7.03     -
            (2.20)   (2.55)   (2.81)   (2.79)   (2.92)
FLBCR       -        -        -        -        -        7.14
                                                         (2.81)
FQBCR       -        -        -        -        -        6.04
                                                         (2.79)

kNN         5.91     6.52     4.58     4.52     -        -
            (2.75)   (2.77)   (2.21)   (1.94)
Centroid    4.34     5.08     7.23     6.17     6.75     -
            (1.99)   (2.40)   (2.97)   (2.85)   (2.91)
FLBCR       -        -        -        -        -        7.23
                                                         (2.97)
FQBCR       -        -        -        -        -        6.17
                                                         (2.85)

Table 7: Proportion of correct classification for the fourth scenario 

n    J    Method      L^1      L^2      L^∞      FPCc     FPCd     FMc      FMd      DH       -
200  50   kNN         .8645    .8647    .8517    .8851    .8830    .9212    .8619    -        -
                      (.0489)  (.0485)  (.0461)  (.0464)  (.0457)  (.0349)  (.0457)
          Centroid    .7328    .7213    .6923    .7234    .7262    .8939    .8386    .8679    -
                      (.0849)  (.0892)  (.0903)  (.0897)  (.0887)  (.0424)  (.0478)  (.0447)
          FLBCR       -        -        -        -        -        -        -        -        .8939
                                                                                              (.0424)
          FQBCR       -        -        -        -        -        -        -        -        .8386
                                                                                              (.0478)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8913
                                                                                              (.0450)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7173
                                                                                              (.0668)

200  100  kNN         .8669    .8712    .8576    .8874    .8872    .9223    .8599    -        -
                      (.0465)  (.0447)  (.0490)  (.0414)  (.0425)  (.0330)  (.0425)
          Centroid    .7345    .7291    .6969    .7316    .7337    .8930    .8335    .8681    -
                      (.0846)  (.0856)  (.0861)  (.0861)  (.0859)  (.0398)  (.0456)  (.0455)
          FLBCR       -        -        -        -        -        -        -        -        .8930
                                                                                              (.0398)
          FQBCR       -        -        -        -        -        -        -        -        .8335
                                                                                              (.0456)
          LBCR Coef.  -        -        -        -        -        -        -        -        .8912
                                                                                              (.0412)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7077
                                                                                              (.0632)

300  50   kNN         .8803    .8854    .8713    .9006    .8987    .9265    .8670    -        -
                      (.0407)  (.0419)  (.0430)  (.0383)  (.0387)  (.0316)  (.0387)
          Centroid    .7303    .7250    .7040    .7268    .7293    .8947    .8340    .8647    -
                      (.0688)  (.0747)  (.0800)  (.0749)  (.0741)  (.0379)  (.0445)  (.0479)
          FLBCR       -        -        -        -        -        -        -        -        .8947
                                                                                              (.0379)
          FQBCR       -        -        -        -        -        -        -        -        .8340
                                                                                              (.0445)
          LBCR Coef.  -        -        -        -        -        -        -        -        .9044
                                                                                              (.0389)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7270
                                                                                              (.0589)

300  100  kNN         .8795    .8797    .8632    .8974    .8960    .9279    .8710    -        -
                      (.0420)  (.0442)  (.0427)  (.0386)  (.0392)  (.0318)  (.0392)
          Centroid    .7346    .7289    .7015    .7314    .7330    .8936    .8324    .8649    -
                      (.0717)  (.0797)  (.0818)  (.0795)  (.0796)  (.0384)  (.0441)  (.0455)
          FLBCR       -        -        -        -        -        -        -        -        .8936
                                                                                              (.0384)
          FQBCR       -        -        -        -        -        -        -        -        .8324
                                                                                              (.0441)
          LBCR Coef.  -        -        -        -        -        -        -        -        .9010
                                                                                              (.0402)
          QBCR Coef.  -        -        -        -        -        -        -        -        .7215
                                                                                              (.0613)

Table 8: Means and standard deviations (between parentheses) of the optimal number of 
principal components needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and 
the FLBCR and FQBCR methods for the fourth scenario 

Method      FPCc     FPCd     FMc      FMd      DH       -
kNN         6.41     6.51     6.49     3.72     -        -
            (2.99)   (2.83)   (3.00)   (1.56)
Centroid    4.87     5.23     8.26     5.06     7.54     -
            (2.51)   (2.72)   (2.65)   (2.49)   (2.77)
FLBCR       -        -        -        -        -        8.26
                                                         (2.65)
FQBCR       -        -        -        -        -        5.06
                                                         (2.49)

kNN         6.52     7.02     6.47     3.71     -        -
            (2.83)   (2.79)   (2.94)   (1.73)
Centroid    5.07     5.51     8.32     4.62     7.77     -
            (2.53)   (2.65)   (2.67)   (2.38)   (2.89)
FLBCR       -        -        -        -        -        8.32
                                                         (2.67)
FQBCR       -        -        -        -        -        4.62
                                                         (2.38)

kNN         6.96     7.11     6.16     3.50     -        -
            (2.89)   (2.84)   (2.90)   (1.27)
Centroid    5.00     5.28     8.51     4.73     7.51     -
            (2.39)   (2.31)   (2.64)   (2.14)   (2.90)
FLBCR       -        -        -        -        -        8.51
                                                         (2.64)
FQBCR       -        -        -        -        -        4.73
                                                         (2.14)

kNN         6.64     7.04     6.42     3.37     -        -
            (2.72)   (2.85)   (2.94)   (1.09)
Centroid    4.90     5.55     8.42     4.96     7.30     -
            (2.20)   (2.67)   (2.55)   (2.25)   (2.89)
FLBCR       -        -        -        -        -        8.42
                                                         (2.55)
FQBCR       -        -        -        -        -        4.96
                                                         (2.25)

4.2 Real data study: Tecator dataset 

Next, the classification procedures are applied to the Tecator dataset previously considered 
by Ferraty and Vieu (2003), Rossi and Villa (2006), Li and Yu (2008), Alonso et al. (2012) 
and Martin-Barragan et al. (2013), among others. The dataset, which consists of 215 
near-infrared absorbance spectra of meat samples recorded on a Tecator Infratec Food and Feed 
Analyzer, is available at http://lib.stat.cmu.edu/datasets/tecator. The absorbance of a meat 
sample is a function given by $\log_{10}(I_0/I)$, where $I_0$ and $I$ are, respectively, the 
intensity of the light before and after passing through the meat sample. Each observation 
consists of a 100-channel absorbance spectrum in the wavelength range 850-1050 nm, together 
with the contents of moisture (water), fat and protein. Therefore, the recorded absorbance can 
be seen as a discretized version of a continuous process. The classification problem here is 
to separate meat samples with a high fat content (more than 20%) from samples with low fat 
content (less than 20%) based on the absorbance. Among the 215 samples, 77 have high fat 
content and 138 have low fat content. Previous analyses of this dataset have suggested that 
classification of the second order derivatives of the observed functions produces lower 
misclassification rates. Therefore, both the original data and their second order derivatives 
are analyzed. In both cases, the discrete observations are converted to functional 
observations using a B-spline basis of order 6 with 20 and 40 basis functions, respectively, 
which is enough to fit the data well. Figure 2 shows the sample of 100-channel absorbance 
spectra and their second derivatives after smoothing. 

In order to evaluate the performance of the functional classification methods given before, 
1000 training samples are considered, each composed of 58 and 104 randomly chosen functions 
of meat with high fat content and low fat content, respectively. Each training sample is 
associated with a test sample composed of the remaining 19 and 34 functions of meat with 
high fat content and low fat content, respectively. Tables 9 and 11 show the mean and the 
standard deviation (between parentheses) of the proportion of correct classifications 
obtained via cross-validation for the two cases. As in the simulation study, the threshold 
values needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and the FLBCR and 
FQBCR methods, and the maximum number of neighbors in the kNN procedures, are determined 
using cross-validation with a maximum number of 15 eigenfunctions and 9 neighbors, 
respectively. In both cases, the kNN procedure with the FMc semi-distance is the winner. The 
highest proportions of correct classification for the Tecator dataset and the second order 
derivatives are 0.9835 and 0.9918, respectively, suggesting that it is not necessary to use 
the second order derivatives of the Tecator data to obtain almost perfect classification. 
Note that, using a similar experiment, Rossi and Villa (2006) obtained correct classification 
rates of 0.9672 and 0.9740 for the original and second order derivatives with SVMs, 
respectively, Li and Yu (2008) obtained correct classification rates of 0.9602 and 0.9891 for 
the original and second order derivatives with a segmentation approach, respectively, Alonso 
et al. (2012) obtained correct classification rates of 0.9798 and 0.9768, respectively, with 
two methods that take into account the original, the first and the second order derivatives, 
and, finally, Martin-Barragan et al. (2013) obtained a correct classification rate of 0.9891 
with SVMs. Note that all of the previous approaches are more sophisticated than the ones 
taken here. 

[Figure 2 panels: "Tecator dataset" and "Second order differencing"] 

Figure 2: Right: Original observations of the Tecator dataset. Left: Second order derivatives 
of the Tecator dataset. High fat content in black and low fat content in gray 
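
The repeated random splitting described above can be sketched as follows. The generator draws a fixed number of training curves per class and leaves the rest for testing; the names `stratified_splits` and `summarize`, and the use of the sample standard deviation (`ddof=1`), are assumptions of this sketch and not details given in the text.

```python
import numpy as np

def stratified_splits(y, n_train_by_class, n_repeats, rng=None):
    """Yield (train_idx, test_idx) pairs with a fixed number of training
    observations per class, e.g. {1: 58, 0: 104} for high/low fat content."""
    rng = np.random.default_rng(rng)
    y = np.asarray(y)
    for _ in range(n_repeats):
        train = np.concatenate([
            rng.choice(np.flatnonzero(y == g), size=m, replace=False)
            for g, m in n_train_by_class.items()
        ])
        test = np.setdiff1d(np.arange(y.size), train)
        yield train, test

def summarize(accuracies):
    """Mean and standard deviation of the proportions of correct classification."""
    a = np.asarray(accuracies, dtype=float)
    return a.mean(), a.std(ddof=1)

# Labels with the class sizes of the Tecator data: 77 high fat, 138 low fat.
y = np.array([1] * 77 + [0] * 138)
splits = list(stratified_splits(y, {1: 58, 0: 104}, n_repeats=3, rng=0))
```

Each pair of index arrays can then be passed to any of the classifiers, and `summarize` applied to the resulting accuracies gives the kind of mean and standard deviation reported in the tables.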

On the other hand, Tables 10 and 12 show the means and the standard deviations (between 
parentheses) of the number of principal components needed to calculate the FPCc, FPCd, FMc, 
FMd and DH semi-distances and the FLBCR and FQBCR methods for the original dataset and their 
second order derivatives. The mean numbers of functional principal components used with the 
functional Mahalanobis semi-distance are slightly larger than those corresponding to the 
functional principal components and Delaigle and Hall semi-distances if the original dataset 
is used, but are sometimes smaller for the second order derivatives. Therefore, there is 
apparently no general rule regarding the number of principal components used. 

Table 9: Proportion of correct classification for the Tecator dataset 

Method      L^1      L^2      L^∞      FPCc     FPCd     FMc      FMd      DH       -
kNN         .7904    .8108    .8602    .8144    .8135    .9835    .9714    -        -
            (.0368)  (.0371)  (.0342)  (.0364)  (.0363)  (.0114)  (.0363)
Centroid    .6784    .6812    .6957    .6813    .6813    .9630    .9521    .9479    -
            (.0343)  (.0347)  (.0346)  (.0347)  (.0348)  (.0173)  (.0218)  (.0322)
FLBCR       -        -        -        -        -        -        -        -        .9517
                                                                                    (.0196)
FQBCR       -        -        -        -        -        -        -        -        .9671
                                                                                    (.0172)
LBCR Coef.  -        -        -        -        -        -        -        -        .9244
                                                                                    (.0244)
QBCR Coef.  -        -        -        -        -        -        -        -        .8958
                                                                                    (.0325)

Table 10: Means and standard deviations of the number of principal components used by 
the FPCc, FPCd, FMc, FMd and DH semi-distances and the FLBCR and FQBCR 
methods for the Tecator dataset 

Method      FPCc     FPCd     FMc      FMd      DH       -
kNN         4.19     4.89     4.86     5.12     -        -
            (.073)   (1.54)   (1.01)   (1.18)
Centroid    1.48     1.52     4.82     5.20     5.05     -
            (0.98)   (1.09)   (0.94)   (1.24)   (1.47)
FLBCR       -        -        -        -        -        5.01
                                                         (1.10)
FQBCR       -        -        -        -        -        5.15
                                                         (1.10)

Table 11: Proportion of correct classification for the second order differences of the 
Tecator dataset 

Method      L^1      L^2      L^∞      FPCc     FPCd     FMc      FMd      DH       -
kNN         .9885    .9852    .9814    .9901    .9870    .9918    .9664    -        -
            (.0091)  (.0099)  (.0109)  (.0080)  (.0094)  (.0076)  (.0094)
Centroid    .9629    .9608    .9546    .9651    .9617    .9678    .9372    .9630    -
            (.0200)  (.0210)  (.0217)  (.0190)  (.0206)  (.0180)  (.0253)  (.0201)
FLBCR       -        -        -        -        -        -        -        -        .9533
                                                                                    (.0195)
FQBCR       -        -        -        -        -        -        -        -        .9555
                                                                                    (.0190)
LBCR Coef.  -        -        -        -        -        -        -        -        .9218
                                                                                    (.0261)
QBCR Coef.  -        -        -        -        -        -        -        -        .7220
                                                                                    (.0581)

Table 12: Means and standard deviations of the number of principal components used by 
the FPCc, FPCd, FMc, FMd and DH semi-distances and the FLBCR and FQBCR 
methods for the second order derivatives of the Tecator dataset 

Method      FPCc     FPCd     FMc      FMd      DH       -
kNN         2.22     3.78     2.67     2.05     -        -
            (0.62)   (1.99)   (1.83)   (1.00)
Centroid    1.63     3.35     2.99     1.66     3.59     -
            (0.63)   (1.73)   (2.96)   (1.43)   (3.47)
FLBCR       -        -        -        -        -        4.24
                                                         (3.85)
FQBCR       -        -        -        -        -        2.00
                                                         (1.29)

4.3 Real data study: Phoneme dataset 

Finally, the classification procedures are applied to the Phoneme dataset described in Ferraty 
and Vieu (2006) and available at 
http://www.math.univ-toulouse.fr/staph/npfda/npfda-datasets.html. The dataset contains 
log-periodograms corresponding to recordings of speakers of 32 ms duration. Here, two 
populations are considered, corresponding to the phonemes "aa" as the vowel in "dark" and 
"ao" as the first vowel in "water". Each speech frame is represented by 400 samples at a 
16-kHz sampling rate, and only the first 150 frequencies from each subject are retained. 
Therefore, the data consist of 800 log-periodograms of length 150, with known phoneme class 
membership. The classification problem here is to separate the two phonemes. The discrete 
observations are converted to functional observations using a B-spline basis of order 6 with 
40 basis functions, which is enough to fit the data well. Figure 3 shows the sample of 
log-periodograms. The figure confirms that it is difficult to distinguish the 
log-periodograms from one another. 

[Figure 3 panel: "Phoneme dataset"] 

Figure 3: Phoneme dataset. Log-periodograms for "aa" in black and log-periodograms for 
"ao" in gray. Note that the log-periodograms in gray hide most of the log-periodograms in 
black 

As in the previous example, 1000 training samples are considered, each composed of 300 
randomly chosen log-periodograms from both vowels. Each training sample is paired with a 
test sample composed of the remaining 200 log-periodograms, 100 per vowel. 
The classification results are shown in Table 13, which reports the mean and the standard 
deviation (in parentheses) of the proportion of correct classifications obtained via 
cross-validation. As in the simulation study and the previous example, the threshold values 
needed to compute the FPCc, FPCd, FMc, FMd and DH semi-distances and the FLBCR and 
FQBCR methods, and the number of neighbors in the kNN procedures, are determined 
using cross-validation with a maximum of 15 eigenfunctions and 9 neighbors, 
respectively. In this case, the centroid method with the FMc semi-distance performs best. 
Note that this method coincides in this case with the functional linear Bayes classification 
rule. The highest proportion of correct classifications for the Phoneme dataset is 0.8238, 
which is slightly larger than that of the other alternatives. 
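
The evaluation protocol described above (repeated random training/test splits, a centroid 
classifier, and the mean and standard deviation of the proportion of correct classifications) 
can be sketched in Python as follows. This is only a minimal illustration on synthetic curves, 
not the code used for the reported results. The function names `centroid_predict` and 
`repeated_split_accuracy`, the pooled within-class covariance used to build the 
Mahalanobis-type semi-distance, the fixed number of components K (which the paper selects by 
cross-validation), and the plain Euclidean inner product on the evaluation grid are all 
assumptions made for the example. 

```python
import numpy as np


def centroid_predict(train_x, train_y, test_x, K):
    """Assign each test curve to the class whose mean is closest in a
    Mahalanobis-type semi-distance built from K principal components."""
    classes = np.unique(train_y)
    means = np.array([train_x[train_y == c].mean(axis=0) for c in classes])
    # Pooled within-class covariance (assumption: common covariance).
    centered = train_x - means[np.searchsorted(classes, train_y)]
    cov = centered.T @ centered / (len(train_x) - len(classes))
    lam, psi = np.linalg.eigh(cov)                  # ascending order
    lam, psi = lam[::-1][:K], psi[:, ::-1][:, :K]   # K largest components
    # Norm of the standardized score differences to each class mean.
    dists = np.array([
        np.linalg.norm((test_x - m) @ psi / np.sqrt(lam), axis=1)
        for m in means
    ])
    return classes[np.argmin(dists, axis=0)]


def repeated_split_accuracy(x, y, n_train_per_class, n_splits, K, seed=0):
    """Mean and standard deviation of the proportion of correct
    classifications over repeated random training/test splits."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(n_splits):
        train_idx = np.concatenate([
            rng.choice(np.flatnonzero(y == c), n_train_per_class, replace=False)
            for c in np.unique(y)
        ])
        test_mask = np.ones(len(y), dtype=bool)
        test_mask[train_idx] = False
        pred = centroid_predict(x[train_idx], y[train_idx], x[test_mask], K)
        accuracies.append(np.mean(pred == y[test_mask]))
    return float(np.mean(accuracies)), float(np.std(accuracies))


# Synthetic two-class curves on a grid of 30 points (illustrative only).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 30)
basis = np.array([np.sin(np.pi * t), np.sin(2 * np.pi * t), np.sin(3 * np.pi * t)])
scores = rng.normal(size=(100, 3)) * np.sqrt([1.0, 0.5, 0.25])
x = scores @ basis + 0.05 * rng.normal(size=(100, 30))
y = np.repeat([0, 1], 50)
x[y == 1] += 3.0 * basis[1]          # shift the mean of the second class

mean_acc, sd_acc = repeated_split_accuracy(x, y, n_train_per_class=30,
                                           n_splits=20, K=3)
print(f"{mean_acc:.4f} ({sd_acc:.4f})")
```

The printed pair has the same form as the entries of Table 13, a mean with its standard 
deviation in parentheses, although the values come from synthetic data and are not comparable 
with the reported ones. 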

On the other hand, Table 14 shows the means and the standard deviations (in parentheses) 
of the number of principal components needed to calculate the FPCc, FPCd, FMc, 
FMd and DH semi-distances and the FLBCR and FQBCR methods for the Phoneme 
dataset. The mean number of functional principal components used by the winning methods 
is around 9. However, other methods with worse performance also have mean values 
close to 9. Therefore, in this case, the differences in performance are apparently due 
to the methods themselves. 

Table 13: Proportion of correct classification for the Phoneme dataset 

             L1       L2       L∞       FPCc     FPCd     FMc      FMd      DH       -
kNN          .7918    .7847    .7838    .7996    .7799    .8124    .7961    -        -
             (.0235)  (.0248)  (.0258)  (.0240)  (.0233)  (.0218)  (.0233)
Centroid     .7542    .7386    .7038    .7401    .7346    .8238    .7994    .8001    -
             (.0319)  (.0307)  (.0283)  (.0307)  (.0303)  (.0236)  (.0218)  (.0281)
FLBCR        -        -        -        -        -        -        -        -        .8238
                                                                                     (.0236)
FQBCR        -        -        -        -        -        -        -        -        .7994
                                                                                     (.0218)
LBCR Coef.   -        -        -        -        -        -        -        -        .8050
                                                                                     (.0250)
QBCR Coef.   -        -        -        -        -        -        -        -        .7802
                                                                                     (.0261)

Table 14: Means and standard deviations of the number of principal components used by 
the FPCc, FPCd, FMc, FMd and DH semi-distances and the FLBCR and FQBCR 
methods with the Phoneme dataset 

             FPCc     FPCd     FMc      FMd      DH       -
kNN          8.31     9.37     9.08     9.48     -        -
             (2.87)   (3.26)   (2.82)   (3.40)
Centroid     6.83     8.68     8.94     8.04     8.08     -
             (2.79)   (3.00)   (1.83)   (3.49)   (2.33)
FLBCR        -        -        -        -        -        8.94
                                                          (1.83)
FQBCR        -        -        -        -        -        8.04
                                                          (3.49)

5 Conclusions 

This paper has introduced a new semi-distance for functional data that generalizes the 
multivariate Mahalanobis distance to the functional framework. To this end, the regularized 
square root inverse operator given in Mas (2007) is used, which allows the functional 
Mahalanobis semi-distance between an observation and the sample mean function of a set of 
functions to be written in terms of the standardized functional principal component scores. 
Afterwards, new versions of several classification procedures have been proposed based on the 
functional Mahalanobis semi-distance. Some Monte Carlo experiments and the analysis of two 
real data examples illustrate the good behavior of the classification methods based on the 
functional Mahalanobis semi-distance. As mentioned previously, the range of applications of 
the functional Mahalanobis semi-distance is large and includes clustering, hypothesis testing 
and outlier detection, among others. These applications will be the subject of future work. 
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Appendix 



Proof of Proposition 2.1 

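From the definition of the functional Mahalanobis semi-distance and of the regularized square 
root inverse operator, it is possible to write: 

$$
d_{FM}^{K}(\chi, \mu_\chi)
= \left\langle \Gamma_K^{-1/2}(\chi - \mu_\chi), \Gamma_K^{-1/2}(\chi - \mu_\chi) \right\rangle^{1/2}
= \left\langle \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} (\psi_k \otimes \psi_k)(\chi - \mu_\chi),
  \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} (\psi_k \otimes \psi_k)(\chi - \mu_\chi) \right\rangle^{1/2}.
$$

Now, since $(\psi_k \otimes \psi_k)(\zeta) = \langle \psi_k, \zeta \rangle \psi_k$, the previous 
expression leads to: 

$$
d_{FM}^{K}(\chi, \mu_\chi)
= \left\langle \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} \langle \psi_k, \chi - \mu_\chi \rangle \psi_k,
  \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} \langle \psi_k, \chi - \mu_\chi \rangle \psi_k \right\rangle^{1/2}.
$$

As the inner product is linear, $\theta_k = \langle \chi - \mu_\chi, \psi_k \rangle$, and the 
$\psi_k$ are orthonormal eigenfunctions, it is possible to write: 

$$
d_{FM}^{K}(\chi, \mu_\chi)
= \left\langle \sum_{k=1}^{K} \frac{\theta_k}{\lambda_k^{1/2}} \psi_k,
  \sum_{k=1}^{K} \frac{\theta_k}{\lambda_k^{1/2}} \psi_k \right\rangle^{1/2}
= \left( \sum_{k=1}^{K} \omega_k^2 \right)^{1/2},
$$

where $\omega_k = \theta_k / \lambda_k^{1/2}$, for $k = 1, \ldots, K$, are the standardized 
functional principal component scores of $\chi$. 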
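
The identity proved above can be checked numerically on a discretized example. The sketch 
below is only illustrative: it replaces functions by vectors on a grid, uses the plain 
Euclidean inner product (an assumption, since no quadrature weights are applied), and the 
function name `fm_semidistance_to_mean` and the randomly generated covariance matrix are 
hypothetical. It computes the semi-distance both from the standardized scores and from the 
operator expression. 

```python
import numpy as np


def fm_semidistance_to_mean(x, mu, cov, K):
    """Return the semi-distance between x and mu computed in two ways:
    from the standardized scores and from the operator expression."""
    lam, psi = np.linalg.eigh(cov)                  # ascending eigenvalues
    lam, psi = lam[::-1][:K], psi[:, ::-1][:, :K]   # keep the K largest
    theta = psi.T @ (x - mu)                        # scores <x - mu, psi_k>
    omega = theta / np.sqrt(lam)                    # standardized scores
    score_form = float(np.sqrt(np.sum(omega ** 2)))
    # Matrix of sum_k lam_k^{-1/2} (psi_k ⊗ psi_k) on the grid.
    root_inverse = (psi / np.sqrt(lam)) @ psi.T
    operator_form = float(np.linalg.norm(root_inverse @ (x - mu)))
    return score_form, operator_form


rng = np.random.default_rng(0)
a = rng.normal(size=(6, 6))
cov = a @ a.T + np.eye(6)        # a positive definite covariance matrix
mu = np.zeros(6)
x = rng.normal(size=6)
score_form, operator_form = fm_semidistance_to_mean(x, mu, cov, K=3)
print(score_form, operator_form)
```

The two printed values agree up to floating-point rounding, which is the content of the 
proposition in this finite-dimensional setting. 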
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Proof of Proposition 2.2 

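By hypothesis, the two functions $\chi_1$ and $\chi_2$ have the same mean function, $\mu_\chi$, 
and the same covariance operator, $\Gamma_\chi$. Therefore, from the Karhunen-Loève expansion: 

$$
\chi_1 = \mu_\chi + \sum_{k=1}^{\infty} \theta_{1k} \psi_k,
$$

and, 

$$
\chi_2 = \mu_\chi + \sum_{k=1}^{\infty} \theta_{2k} \psi_k,
$$

where $\theta_{1k} = \langle \chi_1 - \mu_\chi, \psi_k \rangle$ and 
$\theta_{2k} = \langle \chi_2 - \mu_\chi, \psi_k \rangle$, for $k = 1, 2, \ldots$, are the 
functional principal component scores of $\chi_1$ and $\chi_2$, respectively. Consequently, 
the difference between the two functions $\chi_1$ and $\chi_2$ can be written as: 

$$
\chi_1 - \chi_2 = \sum_{k=1}^{\infty} (\theta_{1k} - \theta_{2k}) \psi_k. \tag{21}
$$

Using the expression of the regularized square root inverse operator, the Mahalanobis 
semi-distance between $\chi_1$ and $\chi_2$ is given by: 

$$
d_{FM}^{K}(\chi_1, \chi_2)
= \left\langle \Gamma_K^{-1/2}(\chi_1 - \chi_2), \Gamma_K^{-1/2}(\chi_1 - \chi_2) \right\rangle^{1/2}
= \left\langle \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} (\psi_k \otimes \psi_k)(\chi_1 - \chi_2),
  \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} (\psi_k \otimes \psi_k)(\chi_1 - \chi_2) \right\rangle^{1/2}.
$$

Now, since $(\psi_k \otimes \psi_k)(\zeta) = \langle \psi_k, \zeta \rangle \psi_k$ and from (21), 
the above expression can be written as: 

$$
d_{FM}^{K}(\chi_1, \chi_2)
= \left\langle \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} \langle \psi_k, \chi_1 - \chi_2 \rangle \psi_k,
  \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}} \langle \psi_k, \chi_1 - \chi_2 \rangle \psi_k \right\rangle^{1/2}
$$

$$
= \left\langle \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}}
  \left\langle \psi_k, \sum_{j=1}^{\infty} (\theta_{1j} - \theta_{2j}) \psi_j \right\rangle \psi_k,
  \sum_{k=1}^{K} \frac{1}{\lambda_k^{1/2}}
  \left\langle \psi_k, \sum_{j=1}^{\infty} (\theta_{1j} - \theta_{2j}) \psi_j \right\rangle \psi_k \right\rangle^{1/2}
$$

$$
= \left( \sum_{k=1}^{K} \frac{1}{\lambda_k}
  \left\langle (\theta_{1k} - \theta_{2k}) \psi_k, (\theta_{1k} - \theta_{2k}) \psi_k \right\rangle \right)^{1/2}
= \left( \sum_{k=1}^{K} (\omega_{1k} - \omega_{2k})^2 \right)^{1/2},
$$

where $\omega_{1k} = \theta_{1k} / \lambda_k^{1/2}$ and $\omega_{2k} = \theta_{2k} / \lambda_k^{1/2}$, 
for $k = 1, 2, \ldots$, are the standardized functional principal component scores of $\chi_1$ 
and $\chi_2$, respectively. 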
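
In practice the eigenvalues and eigenfunctions have to be estimated from a sample of curves. 
The following Python sketch shows one possible way of doing so and then computes the 
semi-distance between two curves as the Euclidean distance between their standardized 
scores. It is a simplified illustration, not the authors' implementation: the curves are 
synthetic, the covariance is the usual sample covariance on the evaluation grid, the inner 
product is the plain Euclidean one, and the function names `standardized_scores` and 
`fm_semidistance` are hypothetical. 

```python
import numpy as np


def standardized_scores(curves, K):
    """Standardized principal component scores of each curve (one per row)."""
    mu = curves.mean(axis=0)
    cov = np.cov(curves, rowvar=False)
    lam, psi = np.linalg.eigh(cov)                  # ascending eigenvalues
    lam, psi = lam[::-1][:K], psi[:, ::-1][:, :K]   # keep the K largest
    return (curves - mu) @ psi / np.sqrt(lam)


def fm_semidistance(omega_1, omega_2):
    """Euclidean distance between two vectors of standardized scores."""
    return float(np.linalg.norm(omega_1 - omega_2))


# Synthetic sample of 40 curves observed at 50 grid points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
curves = (rng.normal(size=(40, 1)) * np.sin(2 * np.pi * t)
          + rng.normal(size=(40, 1)) * np.cos(2 * np.pi * t)
          + rng.normal(size=(40, 1)) * t
          + 0.1 * rng.normal(size=(40, 50)))

omega = standardized_scores(curves, K=3)
d_12 = fm_semidistance(omega[0], omega[1])
print(d_12)
```

Because the scores are divided by the square roots of the sample eigenvalues, their sample 
covariance matrix is the identity, which is what makes the Euclidean distance between them a 
Mahalanobis-type quantity. 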

Proof of Proposition 2.3 

The proof of this proposition is trivial in view of Proposition 2.2, which asserts that 
$d_{FM}^{K}(\chi_1, \chi_2)$ is just the Euclidean distance between the first $K$ standardized 
functional principal component scores of $\chi_1$ and $\chi_2$. Note that 
$d_{FM}^{K}(\chi_1, \chi_2)$ is not a functional distance because 
$d_{FM}^{K}(\chi_1, \chi_2) = 0$ if $\chi_1$ and $\chi_2$ have the same first $K$ functional 
principal component scores, which does not imply $\chi_1 = \chi_2$. 
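
A small numerical example makes the last remark concrete. In the Python sketch below, the 
eigenvalues and eigenfunctions are hypothetical (the canonical basis of a four-point grid with 
arbitrary decreasing eigenvalues), and the function name `fm_semidistance` is introduced only 
for this illustration. Two different curves share their first two scores, so their 
semi-distance with K = 2 vanishes even though the curves differ. 

```python
import math


def fm_semidistance(x1, x2, eigenvalues, eigenfunctions, K):
    """Semi-distance from the first K standardized score differences."""
    total = 0.0
    for lam, psi in list(zip(eigenvalues, eigenfunctions))[:K]:
        # Score difference <x1 - x2, psi_k> with a plain Euclidean inner product.
        score_diff = sum(p * (a - b) for p, a, b in zip(psi, x1, x2))
        total += score_diff ** 2 / lam
    return math.sqrt(total)


# Hypothetical eigenstructure: canonical basis with decreasing eigenvalues.
eigenvalues = [4.0, 2.0, 1.0, 0.5]
eigenfunctions = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.0, -5.0, 7.0]    # differs from x1 only beyond the first 2 components

d_K2 = fm_semidistance(x1, x2, eigenvalues, eigenfunctions, K=2)
d_K4 = fm_semidistance(x1, x2, eigenvalues, eigenfunctions, K=4)
print(d_K2, d_K4)
```

With K = 2 the semi-distance is zero, while using all four components separates the two 
curves. 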

Proof of Theorem 2.1 

The functional Mahalanobis semi-distance between the Gaussian process $\chi$ and its mean 
function $\mu_\chi$ is given by the expression in Proposition 2.1. Now, as $\chi$ is a Gaussian 
process, the standardized functional principal component scores, $\omega_k$, for 
$k = 1, 2, \ldots$, are independent standard Gaussian random variables (see Ash and Gardner, 
1975), which shows the result. 
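
The argument can be illustrated with a small Monte Carlo experiment. Since the sum of squares 
of K independent standard Gaussian variables follows a chi-square distribution with K degrees 
of freedom, the squared semi-distance of a Gaussian process to its mean should have mean close 
to K and variance close to 2K. The Python sketch below checks this on simulated data. 
Everything in it is an assumption made for the example: the grid size, the eigenvalues, the 
orthonormal vectors obtained from a QR factorization, and the mean function. 

```python
import numpy as np

rng = np.random.default_rng(0)
p, K, n = 20, 5, 20000

# Hypothetical eigenstructure: orthonormal vectors from a QR factorization.
psi, _ = np.linalg.qr(rng.normal(size=(p, p)))
lam = 1.0 / np.arange(1, p + 1) ** 2           # decreasing eigenvalues
mu = np.sin(np.linspace(0.0, np.pi, p))        # an arbitrary mean function

# Gaussian curves from the (finite) Karhunen-Loeve expansion.
theta = rng.normal(size=(n, p)) * np.sqrt(lam)
curves = mu + theta @ psi.T

# Standardized scores on the first K components and squared semi-distances.
omega = (curves - mu) @ psi[:, :K] / np.sqrt(lam[:K])
d2 = np.sum(omega ** 2, axis=1)
print(d2.mean(), d2.var())
```

The sample mean and variance of the squared semi-distances should be close to K = 5 and 
2K = 10, up to Monte Carlo error. 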
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